Abstract
Accurate origin classification of chili powder is essential for consumer trust and regulatory compliance. In this study, near-infrared (NIR) spectroscopy and chemical composition analysis were integrated with machine learning to classify domestic (n = 54, Korea) and imported (n = 66, China and Vietnam) chili powder samples. Baseline analysis revealed systematic differences: domestic powders showed higher protein, calcium, and moisture contents, whereas imported samples contained more organic acids, sugars, and capsaicinoids. Using 16 NIR bands selected by the least absolute shrinkage and selection operator (LASSO), support vector machine (SVM) models achieved high accuracy, with Savitzky–Golay first derivative plus standard normal variate (SNV) preprocessing yielding the best performance. Hybrid models that added chemical variables to the NIR bands further improved reliability: NIR alone achieved high origin-classification accuracy in this dataset using as few as four selected bands, whereas NIR combined with organic acid variables (NIR + org) consistently achieved 100% accuracy and produced more reliable predicted probabilities. Shapley additive explanation (SHAP) analysis showed that O–H and C–H overtone bands drove the NIR-only models, while the hybrid models emphasized organic acids and proximate components, providing clear chemical interpretability. These findings demonstrate that integrating NIR with targeted chemical variables enables robust, reliable, and interpretable origin classification, supporting rapid screening and regulatory assurance.
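As an illustration only, the preprocessing named in the abstract (a Savitzky–Golay first derivative followed by standard normal variate scaling) can be sketched in plain Python. The 5-point quadratic window, the order in which the two steps are applied, the unit band spacing, and all function names are assumptions made for this sketch; the abstract does not specify them, and this is not the authors' implementation.

```python
from statistics import mean, stdev

# Convolution weights of a 5-point, quadratic Savitzky-Golay
# first-derivative filter. Window size and polynomial order are
# assumptions for this sketch, not values taken from the study.
SG5_D1 = (-2.0, -1.0, 0.0, 1.0, 2.0)
SG5_NORM = 10.0


def sg_first_derivative(spectrum, step=1.0):
    """Smoothed first derivative of one spectrum.

    `step` is the spacing between adjacent bands. The two points at
    each edge are dropped because the window does not fit there.
    """
    half = len(SG5_D1) // 2
    if len(spectrum) < len(SG5_D1):
        raise ValueError("spectrum is shorter than the filter window")
    derivative = []
    for i in range(half, len(spectrum) - half):
        window = spectrum[i - half:i + half + 1]
        acc = sum(w * v for w, v in zip(SG5_D1, window))
        derivative.append(acc / (SG5_NORM * step))
    return derivative


def snv(spectrum):
    """Standard normal variate: centre and scale a single spectrum."""
    centre = mean(spectrum)
    spread = stdev(spectrum)  # sample standard deviation
    if spread == 0:
        raise ValueError("constant spectrum cannot be SNV-scaled")
    return [(v - centre) / spread for v in spectrum]


def preprocess(spectrum, step=1.0):
    """Assumed order for this sketch: derivative first, then SNV."""
    return snv(sg_first_derivative(spectrum, step))


if __name__ == "__main__":
    # Toy "spectrum" y = x^2, whose exact derivative is 2x.
    toy = [float(x * x) for x in range(7)]
    print(sg_first_derivative(toy))
    print(preprocess(toy))
```

In a workflow like the one described, each preprocessed spectrum would then be passed to band selection and a classifier; those steps are left out here because their settings are not given in this section.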
Data availability
The data supporting the findings of this study are available from the corresponding author upon reasonable request.
Acknowledgements
The authors express their gratitude to the staff and colleagues of the World Institute of Kimchi for their technical support and insightful discussions throughout the study.
Funding
This work was supported by the Ministry of Science and ICT of the Republic of Korea through the World Institute of Kimchi (WIKIM) International Research Program (Grant Number KES2603), and by the Korea Institute of Planning and Evaluation for Technology in Food, Agriculture and Forestry (IPET) through the High Value-added Food Technology Development Program, funded by the Ministry of Agriculture, Food and Rural Affairs (MAFRA) (RS-2024000408169).
Author information
Contributions
Ji-Hee Yang: conceptualization, methodology, formal analysis, investigation, data curation, writing – original draft. Hae-Il Yang: conceptualization, methodology, formal analysis, investigation, data curation, writing – original draft. Se-Jin Park: formal analysis, investigation, validation. Sung-Gi Min: investigation, resources, visualization, funding acquisition. Woo Jin Jun: formal analysis, writing – review & editing. Young-Bae Chung: conceptualization, resources, supervision, project administration, writing – review & editing.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Yang, JH., Yang, HI., Park, SJ. et al. Hybrid near-infrared and chemical-based machine learning enhances the reliability of chili powder origin classification. Sci Rep (2026). https://doi.org/10.1038/s41598-026-47486-7
DOI: https://doi.org/10.1038/s41598-026-47486-7