Scientific Reports
Hybrid near-infrared and chemical-based machine learning enhances the reliability of chili powder origin classification
  • Article
  • Open access
  • Published: 06 April 2026

  • Ji-Hee Yang1,2 na1,
  • Hae-Il Yang1 na1,
  • Se-Jin Park1,2,
  • Sung-Gi Min1,
  • Woo Jin Jun2 &
  • …
  • Young-Bae Chung1 

Scientific Reports, Article number: (2026)

We are providing an unedited version of this manuscript to give early access to its findings. Before final publication, the manuscript will undergo further editing. Please note there may be errors present which affect the content, and all legal disclaimers apply.

Subjects

  • Chemistry
  • Materials science

Abstract

Accurate origin classification of chili powder is essential for consumer trust and regulatory compliance. In this study, near-infrared (NIR) spectroscopy and chemical composition analysis were integrated with machine learning to classify domestic (n = 54, Korea) and imported (n = 66, China and Vietnam) chili powder samples. Baseline analysis revealed systematic differences: domestic powders showed higher protein, calcium, and moisture contents, whereas imported samples contained more organic acids, sugars, and capsaicinoids. Using 16 NIR bands selected by the least absolute shrinkage and selection operator (LASSO), support vector machine (SVM) models achieved high accuracy, with Savitzky–Golay first derivative plus standard normal variate preprocessing yielding the best performance. Hybrid models that combined NIR bands with chemical variables further improved reliability: NIR alone achieved high origin-classification accuracy in this dataset using as few as four selected bands, whereas NIR combined with organic acid variables (NIR + org) consistently achieved 100% accuracy and showed improved probability reliability. Shapley additive explanation (SHAP) analysis showed that O–H and C–H overtone bands drove the NIR-only models, whereas the hybrid models emphasized organic acids and proximate components, providing clear chemical interpretability. These findings demonstrate that integrating NIR with targeted chemical variables enables robust, reliable, and interpretable origin classification, offering rapid screening and regulatory assurance.
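The preprocessing named in the abstract (Savitzky–Golay first derivative followed by standard normal variate, SNV) can be illustrated with a minimal sketch. This is not the authors' implementation. The window length (5 points), the quadratic polynomial order, the order of the two steps, and the function names `sg_first_derivative`, `snv`, and `preprocess` are all assumptions made for illustration, since the abstract does not state them.

```python
from statistics import mean, stdev

# Assumed settings (not stated in the abstract): a 5-point window with a
# quadratic fit, whose first-derivative convolution weights are (-2..2)/10.
SG_D1_WEIGHTS = (-0.2, -0.1, 0.0, 0.1, 0.2)


def sg_first_derivative(spectrum, step=1.0):
    """Savitzky-Golay first derivative at interior points.

    The two points at each edge are dropped rather than extrapolated.
    `step` is the spacing between adjacent bands (hypothetical parameter).
    """
    half = len(SG_D1_WEIGHTS) // 2
    if len(spectrum) < len(SG_D1_WEIGHTS):
        raise ValueError("spectrum is shorter than the smoothing window")
    derivative = []
    for i in range(half, len(spectrum) - half):
        window = spectrum[i - half:i + half + 1]
        derivative.append(sum(w * y for w, y in zip(SG_D1_WEIGHTS, window)) / step)
    return derivative


def snv(spectrum):
    """Standard normal variate: scale one spectrum by its own mean and std."""
    mu = mean(spectrum)
    sd = stdev(spectrum)  # sample standard deviation (n - 1)
    if sd == 0:
        raise ValueError("cannot apply SNV to a constant spectrum")
    return [(y - mu) / sd for y in spectrum]


def preprocess(spectrum, step=1.0):
    """Assumed order: derivative first, then SNV on the derivative."""
    return snv(sg_first_derivative(spectrum, step))
```

For a toy quadratic "spectrum" such as `[0, 1, 4, 9, 16, 25, 36]`, the derivative at the three interior points is approximately 4, 6, and 8, and SNV rescales these to approximately -1, 0, and 1. In practice a library routine such as `scipy.signal.savgol_filter` would typically replace the hand-written convolution. Band selection with LASSO and classification with an SVM would follow as separate steps that are not shown here.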

Data availability

The data supporting the findings of this study are available from the corresponding author upon reasonable request.

Acknowledgements

The authors express their gratitude to the staff and colleagues of the World Institute of Kimchi for their technical support and insightful discussions throughout the study.

Funding

This work was supported by the Ministry of Science and ICT of the Republic of Korea through the World Institute of Kimchi (WIKIM) International Research Program (Grant Number KES2603) and by the Korea Institute of Planning and Evaluation for Technology in Food, Agriculture and Forestry (IPET) through the High Value-added Food Technology Development Program, funded by the Ministry of Agriculture, Food and Rural Affairs (MAFRA) (RS-2024000408169).

Author information

Author notes
  1. Ji-Hee Yang and Hae-Il Yang contributed equally to this work and share first authorship.

Authors and Affiliations

  1. Kimchi Factory Research Group, World Institute of Kimchi, Gwangju, 61755, Republic of Korea

    Ji-Hee Yang, Hae-Il Yang, Se-Jin Park, Sung-Gi Min & Young-Bae Chung

  2. Division of Food and Nutrition, and Research Institute for Human Ecology, Chonnam National University, Gwangju, 61186, Republic of Korea

    Ji-Hee Yang, Se-Jin Park & Woo Jin Jun

Contributions

Ji-Hee Yang: conceptualization, methodology, formal analysis, investigation, data curation, writing – original draft. Hae-Il Yang: conceptualization, methodology, formal analysis, investigation, data curation, writing – original draft. Se-Jin Park: formal analysis, investigation, validation. Sung-Gi Min: investigation, resources, visualization, funding acquisition. Woo Jin Jun: formal analysis, writing – review & editing. Young-Bae Chung: conceptualization, resources, supervision, project administration, writing – review & editing.

Corresponding authors

Correspondence to Woo Jin Jun or Young-Bae Chung.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Supplementary Information (DOCX).

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Yang, JH., Yang, HI., Park, SJ. et al. Hybrid near-infrared and chemical-based machine learning enhances the reliability of chili powder origin classification. Sci Rep (2026). https://doi.org/10.1038/s41598-026-47486-7

  • Received: 19 September 2025

  • Accepted: 31 March 2026

  • Published: 06 April 2026

  • DOI: https://doi.org/10.1038/s41598-026-47486-7

Keywords

  • Origin classification
  • Spectral preprocessing
  • Feature selection
  • Probability reliability
  • SHAP analysis
  • Machine learning interpretability