Abstract
Recurrence after curative resection remains a major clinical challenge in non–small cell lung cancer (NSCLC), and improved postoperative risk stratification is needed. Machine learning (ML) approaches may enhance recurrence prediction using routinely available clinicopathologic data. We analyzed 265 patients who underwent curative lung cancer surgery. Recurrence was the primary endpoint. Seventeen clinical, pathological, and treatment-related variables were evaluated. Multiple supervised ML classifiers were trained using the full dataset and reduced feature sets generated by ANOVA, chi-square, and Kruskal–Wallis methods. Model performance was assessed using accuracy, area under the curve (AUC), and F1 score. Prognostic factors were examined with Cox regression, and model interpretability was explored through feature importance and SHAP analysis. Recurrence occurred in 82 patients (30.9%). AdaBoost achieved the highest accuracy (0.79) and F1 score (0.87), whereas SVC-RBF showed the highest AUC (0.81). Performance remained stable across feature-selection strategies. Histologic subtype, tumor size, tumor grade, and ECOG performance status were consistently influential variables, with ECOG status and tumor size dominating SHAP-based predictions. These findings indicate that ML models using routine clinicopathologic variables can reliably predict recurrence after NSCLC surgery and support individualized postoperative risk assessment.
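For readers less familiar with the three reported metrics, the following is a minimal illustrative sketch of how accuracy, F1 score, and AUC can be computed for a binary recurrence classifier. It is not the authors' code: the labels and scores are invented toy values, and the function names (`accuracy`, `f1_score`, `roc_auc`) and the 0.5 decision threshold are assumptions made for illustration. Only the recurrence count (82 of 265) is taken from the abstract.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of patients whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """Harmonic mean of precision and recall for the chosen positive class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def roc_auc(y_true: Sequence[int], scores: Sequence[float], positive: int = 1) -> float:
    """AUC as the probability that a random positive case outscores a random
    negative case (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical, not from the study): 1 = recurrence, 0 = no recurrence.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.2, 0.6, 0.4, 0.5, 0.1, 0.8, 0.3]
y_pred = [int(s >= 0.5) for s in scores]  # assumed 0.5 decision threshold

print(accuracy(y_true, y_pred))   # 0.75
print(f1_score(y_true, y_pred))   # 0.75
print(roc_auc(y_true, scores))    # 0.9375

# Recurrence rate reported in the abstract: 82 of 265 patients.
recurrence_rate = round(82 / 265 * 100, 1)
print(recurrence_rate)            # 30.9
```

Note that F1 depends on which class is treated as positive, whereas AUC is computed from the ranking of scores and does not depend on a decision threshold.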
Data availability
Because of ethical and privacy considerations, the raw data are not publicly accessible. However, the datasets generated and analyzed during this study can be obtained from the corresponding author upon reasonable request.
Acknowledgements
The authors thank Furkan Aydos and Hatice Rüveyda Akça for their support in the preparation of this study.
Funding
This study was conducted without any external financial support.
Author information
Contributions
Conceptualization: Ugur Ozberk, Selin Akturk Esen; Methodology: Ugur Ozberk, Selin Akturk Esen; Formal analysis and investigation: Ugur Ozberk, Serkan Keskin, Hilal Arslan, Melike Cobankaya; Data curation: Ugur Ozberk, Serkan Keskin, Oznur Bal, Efnan Algın; Writing - original draft preparation: Ugur Ozberk; Writing - review and editing: Selin Akturk Esen, Burak Bilgin, Mehmet Ali Nahit Sendur, Dogan Uncu; Resources: Oznur Bal, Efnan Algın, Burak Bilgin; Supervision: Mehmet Ali Nahit Sendur, Dogan Uncu.
Ethics declarations
Competing interests
The authors declare no competing interests.
Ethical approval
The study was approved by the Clinical Research Ethics Committee of Ankara City Hospital (Decision No: TABED 1/1772/2025, Date: 22/10/2025) and was conducted in accordance with the principles of the Declaration of Helsinki.
Informed consent
Informed consent was waived by the Clinical Research Ethics Committee of Ankara City Hospital because of the retrospective design of the study and the use of anonymized data obtained from medical records.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Ozberk, U., Esen, S.A., Arslan, H. et al. Machine learning–based prediction of recurrence after curative resection in non–small cell lung cancer. Sci Rep (2026). https://doi.org/10.1038/s41598-026-47862-3
DOI: https://doi.org/10.1038/s41598-026-47862-3