Abstract
Authenticating specialty tea products remains a critical challenge in premium food markets, yet current analytical approaches are constrained by limited reproducibility and susceptibility to instrumental variation. Here, we present a deep learning framework that transforms liquid chromatography–mass spectrometry (LC–MS) metabolomic data into image representations, enabling robust authentication of tea products under real-world analytical conditions. Profiling 274 Tieguanyin tea samples across seasonal harvests (spring and autumn) and processing methods (light-scented and strong-scented), our approach achieved 90.9% (95% confidence interval [CI]: 80.4%–96.0%) classification accuracy—substantially outperforming conventional multivariate and machine learning methods (sPLS-DA: 85.5%; random forest: 87.3%). Critically, when subjected to chromatographic drift—a pervasive source of analytical irreproducibility—our model maintained 78.2% accuracy while traditional methods degraded to 69.1%. This framework addresses fundamental limitations in untargeted metabolomics, offering a generalizable solution for food authentication that extends beyond tea to broader applications in agricultural product verification and systems biology.
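As a rough illustration of what turning LC–MS data into an image representation can mean, the sketch below bins a peak list into a small two-dimensional grid, with retention time on one axis and m/z on the other. It is a minimal sketch under stated assumptions, not the authors' pipeline: the function name `peaks_to_image`, the grid size, the log scaling, and the toy peak values are all hypothetical and do not come from the paper.

```python
import math


def peaks_to_image(peaks, rt_range, mz_range, shape=(4, 4)):
    """Bin (rt, mz, intensity) peaks into a 2D grid of summed intensities.

    Rows index m/z bins and columns index retention-time bins. The summed
    intensities are log-scaled and normalised to [0, 1]. All parameters are
    illustrative assumptions, not settings taken from the paper.
    """
    n_rows, n_cols = shape
    rt_min, rt_max = rt_range
    mz_min, mz_max = mz_range
    grid = [[0.0] * n_cols for _ in range(n_rows)]
    for rt, mz, intensity in peaks:
        if not (rt_min <= rt <= rt_max and mz_min <= mz <= mz_max):
            continue  # ignore peaks outside the imaged window
        # Map each coordinate to a bin index; clamp the upper edge into the last bin.
        col = min(int((rt - rt_min) / (rt_max - rt_min) * n_cols), n_cols - 1)
        row = min(int((mz - mz_min) / (mz_max - mz_min) * n_rows), n_rows - 1)
        grid[row][col] += intensity
    logged = [[math.log1p(value) for value in row] for row in grid]
    peak = max(max(row) for row in logged)
    if peak == 0:
        return logged  # empty image: nothing to normalise
    return [[value / peak for value in row] for row in logged]


# Toy peak list: (retention time in min, m/z, intensity). Hypothetical values.
peaks = [(1.2, 150.05, 1000.0), (5.6, 420.10, 250.0), (5.7, 421.00, 50.0)]
image = peaks_to_image(peaks, rt_range=(0.0, 10.0), mz_range=(100.0, 500.0))

# Simulate a small retention-time drift of +0.1 min on every peak.
drifted_peaks = [(rt + 0.1, mz, intensity) for rt, mz, intensity in peaks]
drifted = peaks_to_image(drifted_peaks, rt_range=(0.0, 10.0), mz_range=(100.0, 500.0))
```

In this toy case the drifted peaks fall into the same pixels as the originals, so the two images are identical. That only shows how coarse binning can absorb shifts smaller than a bin width; the abstract does not say whether this is the mechanism behind the reported robustness to chromatographic drift.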
Data availability
The metabolic images are publicly available on Figshare (https://doi.org/10.6084/m9.figshare.30963763) and GitHub (https://github.com/zctea0201/deep-tea-auth).
Code availability
All code for data augmentation, model training, and data visualization is publicly available on Figshare (https://doi.org/10.6084/m9.figshare.30963763) and GitHub (https://github.com/zctea0201/deep-tea-auth).
Acknowledgements
The authors thank Professor Yonghui Dong for advice on the manuscript. This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.
Author information
Contributions
J.Z. and Y.L. conceived and designed the study. X.Z., N.S., and W.X. performed the experiments. J.C., C.Z., and Y.L. analyzed the data. C.Z. and J.Z. prepared the manuscript. Y.L., J.C., and C.Z. reviewed and edited the manuscript. All authors have read and agreed to the published version of the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Zheng, C., Zhou, X., Shao, N. et al. Deep learning enable precision authentication of seasonal and processing signatures in tieguanyin tea. npj Sci Food (2026). https://doi.org/10.1038/s41538-026-00837-0
DOI: https://doi.org/10.1038/s41538-026-00837-0


