Abstract
Current survival prediction models for extranodal natural killer/T-cell lymphoma (ENKTL) have poor accuracy. We developed and validated a machine-learning (ML) algorithm using clinical and pathological co-variates. Data from 977 subjects with ENKTL from 4 cohorts were analyzed. Model performance was evaluated using Harrell’s c-index, receiver operating characteristic (ROC) curves, calibration curves, and decision curve analyses (DCAs). The Gradient Boosting Machine (GBM) outperformed the other 15 ML algorithms tested, leading to the development of the ENKTL-ML score. C-indexes in the evaluation cohort and the 2 external validation cohorts were 0.82 (95% Confidence Interval [CI], 0.76, 0.87), 0.84 (0.81, 0.88) and 0.83 (0.72, 0.94), respectively. The ENKTL-ML score effectively stratified subjects into 3 groups with distinct survival outcomes. Our model was more accurate than the IPI, KPI, PINK-E, and NRI models (all P < 0.001). An online calculator is available at https://highcloud.shinyapps.io/ENKTL_ML_Scores/. The ENKTL-ML score should help physicians predict survival of people with ENKTL.
Introduction
Extranodal natural killer/T-cell lymphoma (ENKTL) is an Epstein-Barr virus (EBV)-related non-Hodgkin lymphoma, primarily prevalent in East Asia and among indigenous populations of Central and South America1. Non-anthracycline-based chemotherapy, usually with L-asparaginase and/or radiation therapy, is the standard-of-care. There is substantial heterogeneity in the outcomes2.
Given the pronounced heterogeneity characterizing ENKTL, the pivotal role of risk stratification becomes evident in distinguishing subjects with varied risk profiles and in guiding treatment decisions3,4,5,6. Several conventional models are reported to predict survival of persons with ENKTL, including the international prognostic index (IPI), Korean prognostic index (KPI), nomogram-revised risk index (NRI) and prognostic index of natural killer lymphoma with EBV data (PINK-E)3,4,5,6. However, these models rely predominantly on clinical parameters and have limited predictive accuracy1. Recent studies have focused on molecular subtyping, single-nucleotide polymorphism (SNP) analysis, and circulating tumor DNA (ctDNA) methylation for risk stratification of ENKTL7,8,9. Nevertheless, their clinical utility remains limited owing to their dependence on next-generation sequencing (NGS). Thus, there is a need for a clinically superior prediction index for ENKTL.
Machine learning (ML) is a potent tool for predicting the survival of people with diverse cancers10. ML algorithms, subject to iterative refinement by computers, enhance task-specific performance such as outcome classification11. Notably, ML algorithms outperform conventional models by appropriately managing nonlinear data and discerning novel patterns among the features and outcomes12. To date, no dependable ML-based prediction model encompassing clinical, laboratory, and pathological co-variates is available in clinical practice for ENKTL. The current study addresses this gap by introducing and validating the clinically viable ENKTL-ML score, which outperforms current survival models with a higher c-index and greater clinical utility in decision curve analyses (DCAs). Our results demonstrate the ENKTL-ML score’s promise as an innovative prediction index for risk stratification of ENKTL.
Results
Patient characteristics
The flowchart of the study is shown in Fig. 1. Our final analysis included 450 subjects in the training cohort, 194 subjects in the evaluation cohort, 187 subjects in external validation cohort 1, and 146 subjects in external validation cohort 2. Across all cohorts, males accounted for 66–72% of the subjects, with a mean age ranging from 42.4 to 48.3 years. The demographic and baseline characteristics of the participants are summarized in Table 1. Chemotherapy alone was received by 29% in the evaluation cohort, 20% in external validation cohort 1, and 69% in external validation cohort 2. Radiotherapy alone was rare in all cohorts. Chemoradiotherapy was received by 45% in the evaluation cohort, 68% in external validation cohort 1, and 30% in external validation cohort 2. Combination anti-PD-1 therapy was administered to 25% in the evaluation cohort, 11% in external validation cohort 1, and 1% in external validation cohort 2. L-asparaginase-based regimens comprised the predominant chemotherapy approach. Anti-PD-1 administration details are summarized in Table S1.
This diagram details the participant screening process from initial identification to final inclusion in each cohort.
Feature selection and ENKTL-ML score
The importance of 14 features was evaluated, and bone marrow involvement was excluded because its importance was <0.01 across all three metrics in the training cohort (Fig. S1). To verify this cut-off, we performed a sensitivity analysis using backward elimination (excluding features in order of importance rank). The analysis demonstrated that the model achieved the highest c-index when incorporating 13 features (Table S2). Hence, these 13 features were used for machine-learning model development and training (Table S3). The Gradient Boosting Machine (GBM) had the highest c-index (0.82) and AUROC (0.86) among the 16 ML models in the evaluation cohort (Fig. 2). The hyperparameters of this algorithm are detailed in Table S4 and Fig. S2. We further employed the SHAP method to illustrate the influence of input features on the prediction of the 5-year OS. The EBV-DNA copy number exerted the greatest influence on the GBM model (Fig. S3). The output of the GBM model was denoted as the ENKTL-ML score. The source code and instructions for the ENKTL-ML score are available at https://highcloud.shinyapps.io/ENKTL_ML_Scores/.
a AUROC for 5-year OS, b c-index for 5-year OS. AUROC area under the receiver operating characteristic curve, OS overall survival, RF random forest, LR logistic regression, SVM support vector machines, kNN k-nearest neighbor, AdaBoost adaptive boosting, Nb naive bayes, XGBoost eXtreme gradient boosting, CoxBoost cox model with likelihood-based boosting, superPC supervised principal component, plsRcox partial least squares regression for cox models, Lasso lasso regression, GBM gradient boosting machine, RSF random survival forest, MLP multilayer perceptron, StepCox stepwise cox regression, TabPFN Tabular Prior-data Fitted Network.
The c-indexes for the ENKTL-ML score in the evaluation cohort, external validation cohort 1, and external validation cohort 2 were 0.82 [95% confidence interval (CI): 0.76, 0.87], 0.84 [0.81, 0.88], and 0.83 [0.72, 0.94], respectively (Table S5). 5-year AUROCs were 0.86 [0.79, 0.92], 0.86 [0.80, 0.91], 0.82 [0.71, 0.93], respectively (Fig. 3, Table S5). Regarding the calibration performance of the ENKTL-ML score, calibration plots for the probability of 5-year OS demonstrated a good correlation between score predictions and actual observations and a slight tendency to underestimate the 5-year OS in the evaluation cohort (Fig. S4a) and both external validation cohorts (Fig. S4b, c).
ROC curves for predicting 5-year OS in the evaluation cohort (a), external validation cohort 1 (c), and external validation cohort 2 (e); Decision curve analysis of risk models for predicting 5-year mortality in the evaluation (b), external validation cohort 1 (d) and external validation cohort 2 (f). OS overall survival, AUROC area under the ROC.
Comparison of ENKTL-ML score with existing indexes
Notably, our model had a significantly higher c-index than the IPI (c-index: 0.61 [0.54, 0.68]), KPI (0.64 [0.56, 0.72]), NRI (0.65 [0.59, 0.72]), and PINK-E (0.66 [0.59, 0.73]) indexes (all P < 0.001; Table S5). The 5-year AUROC of the ENKTL-ML score also exceeded those of the existing models (Fig. 3, Table S5). Furthermore, the ENKTL-ML score exhibited a significantly higher time-dependent AUROC for predicting OS and PFS between 0 and 96 months than the existing indexes in the evaluation and both external validation cohorts (Fig. 4).
Prediction of OS in the evaluation cohort (a), external validation cohort 1 (c), external validation cohort 2 (e); Prediction of PFS in the evaluation (b), external validation cohort 1 (d); external validation cohort 2 (f). AUROC area under the receiver operating characteristic curve, OS overall survival, PFS progression-free survival.
We employed decision curve analysis to assess the ability of the models to inform treatment decisions in both the evaluation and external validation cohorts. In the evaluation cohort, the ENKTL-ML score had a threshold probability range of 0.06–0.99, wider than those of the IPI (0.25, 0.69), KPI (0.16, 0.47), NRI (0.16, 0.49), and PINK-E (0.21, 0.55) models (Fig. 3b). In external validation cohort 1, the ENKTL-ML score had a threshold probability range of 0.06–0.99, wider than those of the IPI (0.17, 0.97), KPI (0.09, 0.55), NRI (0.10, 0.61), and PINK-E (0.17, 0.73) models. In external validation cohort 2, the ENKTL-ML score had a threshold probability range of 0.09–0.96, wider than those of the IPI (0.22, 0.72), KPI (0.14, 0.50), NRI (0.05, 0.63), and PINK-E (0.18, 0.63) models. These findings suggest that the ENKTL-ML score is valuable for supporting clinical decision-making.
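For readers less familiar with decision curve analysis, the quantity compared across threshold probabilities is the net benefit. The following is a minimal sketch using the standard net-benefit definition for a binary outcome; it is not the code used in this study, it ignores censoring, and the function names and toy data are hypothetical.

```python
def net_benefit(y_true, risk, threshold):
    """Net benefit of acting on subjects whose predicted risk >= threshold.

    y_true: 1 if the event (e.g. death within 5 years) occurred, else 0.
    risk:   predicted event probabilities in [0, 1] (hypothetical inputs).
    """
    if not 0 < threshold < 1:
        raise ValueError("threshold must be strictly between 0 and 1")
    n = len(y_true)
    flagged = [(y, r) for y, r in zip(y_true, risk) if r >= threshold]
    true_pos = sum(1 for y, _ in flagged if y == 1)
    false_pos = sum(1 for y, _ in flagged if y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return true_pos / n - (false_pos / n) * threshold / (1 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Reference strategy: act on every subject regardless of predicted risk."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Toy data, for illustration only (not study data).
toy_outcomes = [1, 1, 0, 0, 0, 1, 0, 0]
toy_risks = [0.9, 0.7, 0.2, 0.4, 0.1, 0.3, 0.6, 0.05]
toy_nb = net_benefit(toy_outcomes, toy_risks, 0.5)
toy_nb_all = net_benefit_treat_all(toy_outcomes, 0.5)
```

A model is usually considered useful at a given threshold when its net benefit exceeds both the treat-all reference and zero (the treat-none reference); the threshold ranges reported above can be read in that light.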
Risk stratification of ENKTL subjects by ENKTL-ML score
We established dual cut-offs for the ENKTL-ML score, corresponding to 85% sensitivity (rule-out cut-off: −1.379) and 95% specificity (rule-in cut-off: −1.030), to predict the 5-year OS in the evaluation cohort (Table S6). Subjects with an ENKTL-ML score below −1.379 were assigned to the low-risk category, those with scores between −1.379 and −1.030 were classified as intermediate-risk, and those equal to or higher than −1.030 were assigned to the high-risk category. In the evaluation cohort, external validation cohort 1, and external validation cohort 2, 18.0%, 27.2%, and 26.7% of subjects, respectively, were categorized as high-risk.
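The dual cut-off rule can be written as a small function. This is an illustrative sketch only: the two cut-off values come from the text above, while the function name is hypothetical and the handling of a score exactly equal to −1.379 (assigned here to the intermediate group) is an assumption.

```python
RULE_OUT_CUTOFF = -1.379  # 85% sensitivity cut-off reported in the text
RULE_IN_CUTOFF = -1.030   # 95% specificity cut-off reported in the text


def risk_group(enktl_ml_score):
    """Map an ENKTL-ML score to a risk category using the dual cut-offs."""
    if enktl_ml_score < RULE_OUT_CUTOFF:
        return "low"
    if enktl_ml_score < RULE_IN_CUTOFF:
        # Assumption: a score exactly at the rule-out cut-off is intermediate.
        return "intermediate"
    return "high"
```

For example, `risk_group(-1.5)` falls in the low-risk group, `risk_group(-1.2)` in the intermediate-risk group, and `risk_group(-1.030)` in the high-risk group.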
Concerning prognosis, the respective 5-year OS rates in the evaluation cohort for low-risk, intermediate-risk, and high-risk groups were 89.4% (95% CI: 82.3, 97.1%), 62.6% (48.4, 80.9%), and 8.2% (1.6, 41.5%), with a trend P < 0.001 (Fig. 5a and Table S7). Compared to the low-risk group, the corresponding hazard ratios (HRs) were 4.3 (1.9, 9.3; P < 0.001) and 18.2 (8.6, 38.6) for the intermediate- and high-risk groups, respectively (P < 0.001, Table S7). In external validation cohort 1, the 5-year OS rates for the low-, intermediate-, and high-risk groups were 78.1% (67.6, 90.1%), 46.0% (31.8, 66.5%), and 3.7% (0.6, 22.1%), respectively, with a trend P < 0.001 (Fig. 5c; Table S7). Compared to the low-risk group, the corresponding HRs were 3.6 (1.8, 7.2; P < 0.001) and 20.2 (10.7, 38.2; P < 0.001) for the intermediate- and high-risk groups, respectively (P < 0.001, Table S7). In external validation cohort 2, the 5-year OS rates for the low-risk, intermediate-risk, and high-risk groups were 98.1% (94.5, 100%), 89.8% (78.7, 100%), and 49.9% (31.9, 78.3%), respectively, with a trend P < 0.001 (Fig. 5e; Table S7). Using the low-risk group as a reference, the corresponding HRs were 6.3 (0.7, 53.6; P = 0.1) and 31.7 (4.1, 244.0; P < 0.001) for the intermediate- and high-risk groups, respectively (P < 0.001, Table S7). Applying the same classification for PFS, both the evaluation and validation cohorts demonstrated significantly inferior PFS in the intermediate- and high-risk groups compared to that in the low-risk group (Fig. 5b, d, f; Table S8).
OS in the evaluation cohort (a), external validation cohort 1 (c) and external validation cohort 2 (e); PFS in the evaluation (b), external validation cohort 1 (d) and external validation cohort 2 (f). OS overall survival, PFS progression-free survival.
Subgroup analyses of Ann Arbor stage I subjects
We conducted a focused analysis of how well the ENKTL-ML score stratifies Ann Arbor stage I ENKTL subjects. In the evaluation cohort, the 5-year OS rates for the low-, intermediate-, and high-risk groups were 86.6% (95% CI: 75.7, 99.1%), 47.6% (25.6, 88.4%), and 0%, respectively (P < 0.001, Fig. S5a). Correspondingly, in external validation cohort 1, the OS rates were 75.9% (62.1, 92.7%), 45.8% (24.3, 86.6%), and 0%, respectively (P < 0.001, Fig. S5c). In external validation cohort 2, the 5-year OS rates for the low- and intermediate-risk groups were 100% and 50.0% (35.4% and 12.5%, respectively) (P < 0.001, Fig. S5e). Similarly, PFS demonstrated significant differences among the low-, intermediate-, and high-risk groups (Fig. S5b, d, f). These results suggest that the ENKTL-ML score effectively stratifies Ann Arbor stage I subjects into two or three risk groups with distinct outcomes.
Subgroup analyses of the different treatments
Subgroup analyses were performed on the derivation cohort. Notably, among subjects classified as low- and intermediate-risk, there were no significant differences in OS (HR = 0.9 [95% CI: 0.5–1.6]; P = 0.7) or PFS (HR = 1.4 [0.9–2.1]; P = 0.1) between those who received combinational anti-PD-1 therapy and those who underwent conventional therapy (chemotherapy with or without RT) (Fig. 6a, b). However, combinational anti-PD-1 therapy demonstrated a significant association with prolonged OS (HR = 0.5 [0.3–0.8]; P = 0.006) and PFS (HR = 0.6 [0.4–0.9]; P = 0.02) in subjects classified as high-risk (Fig. 6c, d). To address potential selection bias in treatment assignment between these treatments, we performed propensity score matching (PSM) within each risk stratum (1:1 matched). The baseline characteristics of the different risk subgroups after PSM were well balanced (Tables S9–10). Among subjects classified as low- and intermediate-risk after PSM, there were no significant differences in OS (HR = 0.84 [0.36–1.93]; P = 0.68) or PFS (HR = 1.06 [0.59–1.91]; P = 0.85) between combinational anti-PD-1 therapy and conventional therapy (Fig. S6a, b). In addition, in subjects classified as high-risk after PSM, combinational anti-PD-1 therapy remained associated with prolonged OS (HR = 0.50 [0.27–0.92]; P = 0.03), with a similar but not statistically significant trend for PFS (HR = 0.62 [0.36–1.06]; P = 0.08) (Fig. S6c, d). The score also distinguished significant differences across risk groups in subjects receiving chemotherapy alone (Fig. S7a, b) or chemoradiotherapy (Fig. S7c, d).
OS (a) and PFS (b) for low- or intermediate-risk subjects; OS (c) and PFS (d) for high-risk subjects. OS overall survival, PFS progression-free survival.
Discussion
In this study, we utilized data from an expansive multicenter cohort comprising subjects diagnosed with ENKTL. Our objective was to formulate and assess a readily applicable ENKTL-ML score, derived using the GBM algorithm, designed to predict the survival outcomes of ENKTL subjects undergoing contemporary treatment modalities. Our findings indicate that the ENKTL-ML score exhibits exceptional discriminatory capabilities for predicting survival. Notably, we subjected the score to external validation in independent cohorts sourced from the West China Hospital and Sichuan Cancer Hospital. Significantly, the ENKTL-ML score demonstrated superior performance in risk stratification compared to the existing ENKTL prediction models, including the IPI, KPI, PINK-E, and NRI. Based on the ENKTL-ML score, the subjects were categorized into low-, intermediate-, and high-risk groups, displaying marked differences in survival outcomes.
Prognostic stratification plays a pivotal role in the management of ENKTL, facilitating the identification of patients at an elevated risk of disease progression or death13. Additionally, it aids clinicians in tailoring treatment regimens to align with the specific needs of individual subjects14, thereby establishing a framework for personalized care that can enhance outcomes for persons with this aggressive disease15.
Previous indexes such as IPI, KPI, PINK-E, and NRI are widely used and rely on easily measurable clinical parameters, such as age, disease stage, lymph node involvement, and LDH levels. One notable limitation of these indexes is that they overlook the prognostic value of biomarkers that contribute to ENKTL heterogeneity. For instance, the PINK-E index did not assess Ki67 expression owing to insufficient data5. Furthermore, the predictive efficacy of these indexes is limited, partly because of their derivation from single regression analysis and limited clinical parameters. Additionally, the IPI, KPI, and NRI encompassed subjects receiving anthracycline-based therapies. Because anthracyclines are no longer routinely administered, these models do not reflect current practice. Sun et al. reported a model based on the lightGBM algorithm; however, their study had a small sample size and no validation16. To overcome these limitations, a more comprehensive prognostic index for ENKTL is needed.
The ENKTL-ML score developed and validated in this study integrates the Ki67 proliferative index, EBV-DNA data, LDH, and clinically measurable parameters from the existing indexes. This amalgamation affords more precise and personalized risk stratification for subjects with ENKTL. The Ki67 protein has established diagnostic and prognostic significance as a proliferation marker in multiple malignancies17,18,19,20. In mantle cell lymphoma, the Ki67 proliferative index has been proven to be a robust prognostic factor and is involved in risk-stratification strategies21,22. Several previous studies have posited a potential correlation between elevated Ki67 expression and unfavorable prognosis in ENKTL23,24,25. Consequently, we incorporated the Ki67 proliferative index as a novel pathological biomarker for the development of the ENKTL-ML score.
The proposed model has several advantages. All enrolled subjects received contemporary non-anthracycline-based therapy, mitigating the confounding impact of outdated therapeutic regimens. Furthermore, risk stratification facilitated by the ENKTL-ML score predicted both the OS and PFS. It manifests sustained long-term predictive accuracy, particularly beyond the 5-year mark, enabling clinicians to categorize patients and customize treatment strategies accordingly. The ENKTL-ML score exhibits superior performance in terms of AUROC, c-index, and clinical decision guidance compared with existing indexes. Furthermore, we developed an online tool that renders the implementation of the ENKTL-ML score feasible in clinical practice. The tests for Ki67 proliferative index, EBV-DNA data, and LDH are readily accessible in routine diagnostics performed using standard laboratory techniques. Therefore, the ENKTL-ML score has emerged as a promising prediction index, surpassing the NRI, IPI, KPI, and PINK-E indexes.
Presently, people with ENKTL undergo diverse management approaches, primarily dictated by the Ann Arbor staging system. Distinct strategies are applied for early-versus advanced-stage ENKTL. Early-stage subjects may attain prolonged survival through combined chemotherapy and RT or even with initial RT alone15. However, advanced ENKTL mortality remains high, notwithstanding asparaginase-based chemotherapy with or without RT26. High programmed death ligand 1 (PD-L1) expression on ENKTL tumor cells promotes immune escape and increases sensitivity to anti-PD-1 antibodies27,28,29,30. The effectiveness of PD-1 antibodies has been demonstrated in relapsed/refractory ENKTL subjects, and the application of anti-PD-1 in chemotherapy as a first-line treatment has shown promising efficacy in newly diagnosed cases of advanced ENKTL31,32,33,34,35.
Our interest lies in understanding how ENKTL-ML scores contribute to the development of risk-adapted treatment approaches. Subgroup analysis of subjects with stage I ENKTL revealed the efficacy of the ENKTL-ML score in identifying those with a favorable prognosis. Approximately 72% of stage I subjects classified as low risk by the ENKTL-ML model exhibit a favorable prognosis (5-year OS, 76–100%), complementing the existing Ann Arbor staging system. This underscores its potential in guiding treatment decisions, especially for low-risk early-stage subjects who might be considered for RT alone or chemotherapy with reduced surveillance in the RT setting. In contrast, the high mortality in advanced-stage or high-risk subjects highlights their intrinsic chemotherapy resistance, emphasizing the potential utility of immunotherapy. The present analysis indicates improved survival in high-risk subjects treated with combinational anti-PD-1 therapy compared with conventional therapy, providing insights for clinicians navigating patient selection in the era of immunotherapy. It is crucial to note that although we employed propensity score matching to balance measured baseline covariates, we cannot rule out residual confounding by unmeasured factors. Furthermore, in subjects receiving chemotherapy alone or chemoradiotherapy, this model stratified them into distinct risk groups with significant outcome differences.
Our study has several limitations. Its retrospective nature may introduce biases, and the conclusions are specific to subjects with ENKTL on non-anthracycline-based therapy. The selection of combination regimens for anti-PD-1 therapy was based on physician discretion. Treatment approaches differ across centers due to evolving evidence and regional guidelines, as well as the clinician’s choices. We also acknowledge that retrospective treatment heterogeneity limits causal inferences about specific therapies. Analysis of combinational anti-PD-1 therapy was exploratory and hypothesis-generating, as it reflects a single-center experience with limited external comparators. The above finding is not a validation of anti-PD-1 efficacy but rather a signal to prioritize prospective investigation in high-risk groups. Although we had a relatively large sample size for this rare malignancy, larger international prospective studies and further prediction models of outcomes for specific therapies are still warranted. In addition, the median follow-up time in external validation cohort 2 was shorter than in the other cohorts. While this score integrates key biomarkers like Ki67 and EBV-DNA, our model does not include deeper molecular profiling data. Developing integrated models that combine multi-omics data represents an avenue for future research.
The ENKTL-ML score significantly improved prognostic accuracy and risk discrimination, which may help optimize clinical decision-making in the contemporary landscape of chemoradiotherapy and immunotherapy. These findings can support physicians in prognostic assessment of ENKTL at diagnosis.
Methods
Study design and subjects
We studied four cohorts of consecutive subjects with newly diagnosed ENKTL treated with non-anthracycline-based regimens with or without RT. The derivation cohort included subjects treated at the Sun Yat-sen University Cancer Center (SYSUCC) from January 2011 to December 2022 (partitioned into training and evaluation cohorts). The external validation cohorts were from West China Hospital (external validation cohort 1, June 2013 to March 2022) and Sichuan Cancer Hospital (external validation cohort 2, August 2013 to October 2022). Diagnoses were confirmed by ≥2 experienced haemato-pathologists according to the 4th revised edition of the WHO hematopoietic and lymphoid tumor classification36. Subjects who received palliative therapy or anthracyclines, or who had uncertain diagnoses, missing data, or follow-up <6 months, were excluded. Subjects who died within 6 months were not excluded. Ethics approval for this study was obtained from the following institutional Ethics Committees: the Ethics Committee of Sun Yat-sen University Cancer Center (Approval No. SL-B2022-024-01), the Ethics Committee of West China Hospital, Sichuan University (Approval No. WCHEC-2023-855), and the Ethics Committee of Sichuan Cancer Hospital & Institute. Informed consent was obtained from each patient prior to enrollment. The study was performed in accordance with the Declaration of Helsinki and is reported in accordance with the TRIPOD guidelines37 (Table S11).
Primary outcomes
Overall survival (OS) was measured from diagnosis to death or last follow-up. Progression-free survival (PFS) spanned diagnosis to relapse, progression, death, or last follow-up. The last follow-up date was determined from institutional medical records via investigator case reports. Subjects alive but lost to follow-up were censored at their last confirmed contact date. Relapse/progression was defined as new or progressing lesions confirmed by local physical/imaging exams per Lugano 2014 criteria38. Central review was omitted due to retrospective design, but investigator expertise (all lymphoma specialists) supported local diagnoses. The first event determined progression-free survival in patients with multiple recurrences.
Baseline assessment
Two independent blinded researchers systematically extracted clinical, laboratory, and pathological co-variates recorded at diagnosis from the medical records. A third blinded researcher resolved discordances. We verified the values of each co-variate. The baseline co-variates considered at diagnosis comprised: (1) sex; (2) age; (3) Eastern Cooperative Oncology Group performance status (ECOG PS); (4) presence of B symptoms; (5) Ann Arbor stage; (6) bone marrow involvement; (7) regional lymph node involvement (specifically, the lymph node region corresponding to primary extranodal sites)5; (8) distant lymph node involvement (beyond regional lymph nodes)5; (9) number of extra-nodal sites involved; (10) primary tumor site; (11) primary tumor invasion (PTI, evaluated and defined according to a previous study)39; (12) serum lactic dehydrogenase (LDH) levels; (13) Ki67 proliferative index; (14) serum/plasma EBV-DNA copy number (expressed in copies/ml). We also collected 2 treatment-related details of the included subjects: (1) primary treatment; (2) chemotherapy regimen. The determination of regional/distant lymph node involvement was based on computerized tomography (CT), magnetic resonance imaging (MRI), positron emission tomography (PET) imaging, and/or biopsy5. Tumors originating from the upper aerodigestive tract (UADT) and non-UADT were classified based on their involvement in the nasal area40. Radiation therapy (RT) encompassed either an extended field or an extended involved field covering the primary tumor and adjacent areas and was administered at a median dose of 50 Gy (range, 41–70 Gy; dose per fraction, 1.8–2.0 Gy). Combinational anti-PD-1 therapy was defined as the first-line administration of an anti-PD-1 antibody combined with chidamide (histone deacetylase inhibitor) or P-GEMOX (pegaspargase, oxaliplatin, and gemcitabine) chemotherapy. If these combined regimens achieved a complete response (CR), anti-PD-1 monotherapy was continued for 2 years.
If partial response (PR) was observed, the chemotherapy regimen was continued until disease progression or relapse occurred. In the event of progression, the chemotherapy regimen was adjusted, and treatment was continued until death from any cause or the last follow-up assessment. The choice of chemotherapy regimen was guided by the physician’s clinical experience, the patient’s individual characteristics, and their ability to tolerate treatment. Anti-PD-1 antibodies include sintilimab and tislelizumab.
Feature selection course
Feature selection was performed in the training cohort. For the development of ML models, a comprehensive set of 14 demographic and clinicopathological features at diagnosis was used, as outlined in Table S3. To refine the model, feature selection was performed using the feature importance of XGBoost. The outcome of the XGBoost model was 5-year OS. The importance of each feature was evaluated using three metrics (gain, weight, and cover). To identify the optimal subset of features, backward elimination was further employed. This process iteratively removes the features contributing least to the model’s performance, as evaluated by the area under the receiver operating characteristic curve (AUROC) of the XGBoost model, until the model achieves the best balance between predictive performance and feature reduction. To ensure the robustness of the selected features, 5-fold cross-validation was conducted during the backward elimination process.
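The backward elimination loop described above can be sketched generically as follows. This is an illustration under stated assumptions, not the study's implementation: `score_fn` is a hypothetical callable standing in for the cross-validated AUROC of a model fitted on a given feature subset, the importance ranking is taken as given, and ties are resolved in favor of the smaller subset.

```python
def backward_elimination(ranked_features, score_fn):
    """Drop features from least to most important and keep the best subset.

    ranked_features: feature names ordered from most to least important.
    score_fn: hypothetical callable mapping a list of feature names to a
              performance estimate (higher is better), e.g. a cross-validated
              AUROC computed elsewhere.
    Returns (best_subset, best_score).
    """
    current = list(ranked_features)
    best_subset, best_score = list(current), score_fn(current)
    while len(current) > 1:
        current = current[:-1]  # remove the least important remaining feature
        score = score_fn(current)
        # Assumption: on a tie, prefer the smaller (more parsimonious) subset.
        if score >= best_score:
            best_subset, best_score = list(current), score
    return best_subset, best_score


# Toy scoring function keyed by subset size, for illustration only.
toy_scores = {4: 0.80, 3: 0.85, 2: 0.83, 1: 0.70}
toy_subset, toy_score = backward_elimination(
    ["f1", "f2", "f3", "f4"], lambda feats: toy_scores[len(feats)]
)
```

In the toy run, removing only the lowest-ranked feature gives the highest score, so the three top-ranked features are retained.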
Machine learning algorithm
The derivation cohort was randomly partitioned into two datasets: a training cohort (70%) designated for machine learning (ML) training and parameter optimization, and an evaluation cohort (30%) reserved for testing on previously unseen data. Sixteen ML algorithms were used to predict OS in the training cohort (Table S12). The prediction of 5-year OS was executed using 8 binary classifiers, and 8 algorithms were employed to process survival-censored data. The hyper-parameters were optimized through an exhaustive ten-fold cross-validation grid search. The discrimination capability of the algorithms in the evaluation cohort was evaluated using Harrell’s c-index and the receiver operating characteristic (ROC) curve. The algorithm exhibiting the highest c-index and AUROC for the 5-year OS in the evaluation cohort was identified as the top performer and was thus selected for risk stratification. The output of this top performer was subsequently used to predict the OS and PFS in ENKTL. The model was compared to other prediction models (IPI, KPI, NRI, PINK-E), with risk group definitions in Table S13. The SHapley Additive exPlanations (SHAP) method was employed to explain how the ML algorithm works41. A comprehensive depiction of the workflow is shown in Fig. S8.
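As an illustration of the discrimination metric used above, Harrell's c-index can be computed directly from follow-up times, event indicators, and model outputs. The sketch below is a simplified version for exposition (pairs with tied times are skipped, and a higher score is assumed to mean higher risk); the function name and toy data are hypothetical and this is not the code used in the study.

```python
def harrell_c_index(times, events, risk_scores):
    """Harrell's concordance index for right-censored survival data.

    times:       follow-up time for each subject.
    events:      1 if the event was observed, 0 if censored.
    risk_scores: model output, assumed to increase with risk.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # a pair is anchored on the subject with an observed event
        for j in range(n):
            # Comparable pair: subject i had the event before subject j's time.
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data, for illustration only (times in months).
toy_times = [5, 10, 12, 20]
toy_events = [1, 0, 1, 0]
c_perfect = harrell_c_index(toy_times, toy_events, [0.9, 0.4, 0.6, 0.1])
c_partial = harrell_c_index(toy_times, toy_events, [0.9, 0.4, 0.05, 0.1])
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering of comparable pairs, which is the scale on which the c-indexes reported in this study should be read.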
Statistical analysis
Baseline data are presented as the mean ± standard deviation (for normally distributed continuous variables) or median (with IQR) for skewed data. Categorical variables are summarized as counts and percentages. Group comparisons were performed using independent t-tests or Mann–Whitney U tests for continuous variables and chi-square or Fisher’s exact tests for categorical variables. The performance of the ENKTL-ML score was evaluated using the c-index and time-dependent AUROC in the evaluation and validation cohorts, respectively. Calibration was assessed using calibration plots and decision curve analysis was used to compare the net benefits at different threshold probabilities to assess clinical utility42. Dual cut-offs for the ENKTL-ML score were set at 85% sensitivity (rule-out) and 95% specificity (rule-in) to predict the 5-year OS. Participants were categorized as low-risk (below rule-out), intermediate-risk (between cut-offs), or high-risk (above rule-in). To address potential selection bias in treatment assignment between subjects receiving combinational anti-PD-1 therapy and conventional treatment, we performed propensity score matching within each risk stratum. A logistic regression model was used to estimate propensity scores, representing the probability of receiving combinational anti-PD-1 therapy based on the baseline covariates. Within each risk stratum, subjects in the anti-PD-1 group were matched 1:1 to those in the conventional treatment group using nearest-neighbor matching with a caliper width of 0.2 standard deviations of the propensity score, without replacement. This method minimizes differences in baseline characteristics while retaining statistical power43. Post-matching balance was assessed via standardized mean differences (SMD), where an SMD < 0.1 indicated adequate balance. Sensitivity analyses confirmed robustness of the matching process. The matched cohorts were analyzed for survival outcomes using Cox proportional hazards models. 
The R package ‘compareC’ was used to compare the c-indexes of the models44. OS and PFS were estimated using Kaplan-Meier curves with log-rank tests for significance (two-sided P < 0.05). All analyses were performed using R software (version 4.3.2, https://cran.r-project.org/src/base/R-4/).
Data availability
Study data are available on the Research Data Deposit public platform (www.researchdata.org.cn). Data-sharing requests should be sent to the corresponding author.
Code availability
Source code is available at https://highcloud.shinyapps.io/ENKTL_ML_Scores/ and on the Research Data Deposit public platform (www.researchdata.org.cn).
References
Wang, H., Fu, B. B., Gale, R. P. & Liang, Y. NK-/T-cell lymphomas. Leukemia 35, 2460–2468 (2021).
Yamaguchi, M. et al. Phase II study of SMILE chemotherapy for newly diagnosed stage IV, relapsed, or refractory extranodal natural killer (NK)/T-cell lymphoma, nasal type: the NK-Cell Tumor Study Group study. J. Clin. Oncol. 29, 4410–4416 (2011).
International Non-Hodgkin's Lymphoma Prognostic Factors Project. A predictive model for aggressive non-Hodgkin’s lymphoma. N. Engl. J. Med. 329, 987–994 (1993).
Lee, J. et al. Extranodal natural killer T-cell lymphoma, nasal-type: a prognostic model from a retrospective multicenter study. J. Clin. Oncol. 24, 612–618 (2006).
Kim, S. J. et al. A prognostic index for natural killer cell lymphoma after non-anthracycline-based treatment: a multicentre, retrospective analysis. Lancet Oncol. 17, 389–400 (2016).
Yang, Y. et al. Prognostic nomogram for overall survival in previously untreated patients with extranodal NK/T-cell lymphoma, nasal-type: a multicenter study. Leukemia 29, 1571–1577 (2015).
Xiong, J. et al. Genomic and transcriptomic characterization of natural killer T Cell lymphoma. Cancer Cell 37, 403–419 (2020).
Tian, X. P. et al. A composite single-nucleotide polymorphism prediction signature for extranodal natural killer/T-cell lymphoma. Blood 138, 452–463 (2021).
Tian, X. P. et al. Diagnostic performance and prognostic value of circulating tumor DNA methylation marker in extranodal natural killer/T cell lymphoma. Cell Rep. Med. 4, 100859 (2023).
Swanson, K., Wu, E., Zhang, A., Alizadeh, A. A. & Zou, J. From patterns to patients: advances in clinical machine learning for cancer diagnosis, prognosis, and treatment. Cell 186, 1772–1791 (2023).
Lee, C. et al. Application of a novel machine learning framework for predicting non-metastatic prostate cancer-specific mortality in men using the Surveillance, Epidemiology, and End Results (SEER) database. Lancet Digit. Health 3, e158–e165, https://doi.org/10.1016/s2589-7500(20)30314-9 (2021).
Kuenzi, B. M. et al. Predicting drug response and synergy using a deep learning model of human cancer cells. Cancer Cell 38, 672–684 (2020).
He, X., Gao, Y., Li, Z. & Huang, H. Review on natural killer/T-cell lymphoma. Hematol. Oncol. 41, 221–229 (2021).
Liu, X. et al. Risk-dependent conditional survival and failure hazard after radiotherapy for early-stage extranodal natural killer/T-cell lymphoma. JAMA Netw. Open 2, e190194, https://doi.org/10.1001/jamanetworkopen.2019.0194 (2019).
Yang, Y. et al. Risk-adapted therapy for early-stage extranodal nasal-type NK/T-cell lymphoma: analysis from a multicenter study. Blood 126, 1424–1432 (2015).
Sun, J. et al. New prognostic models for extranodal natural killer T-cell lymphoma, nasal-type using Cox regression and machine learning. Transl. Cancer Res. 10, 613–626 (2021).
Gerdes, J. et al. Prognostic relevance of tumour-cell growth fraction in malignant non-Hodgkin’s lymphomas. Lancet 2, 448–449 (1987).
Ellis, M. J. et al. Ki67 proliferation index as a tool for chemotherapy decisions during and after neoadjuvant aromatase inhibitor treatment of breast cancer: results from the American College of Surgeons Oncology Group Z1031 trial (Alliance). J. Clin. Oncol. 35, 1061–1069 (2017).
Xu, B. et al. International medullary thyroid carcinoma grading system: a validated grading system for medullary thyroid carcinoma. J. Clin. Oncol. 40, 96–104 (2022).
Gimotty, P. A. et al. Biologic and prognostic significance of dermal Ki67 expression, mitoses, and tumorigenicity in thin invasive cutaneous melanoma. J. Clin. Oncol. 23, 8048–8056 (2005).
Hoster, E. et al. Prognostic value of Ki-67 Index, cytology, and growth pattern in mantle-cell lymphoma: results from randomized trials of the European Mantle Cell Lymphoma Network. J. Clin. Oncol. 34, 1386–1394 (2016).
Vose, J. M. et al. Integrative analysis of clinicopathological features defines novel prognostic models for mantle cell lymphoma in the immunochemotherapy era: a report from The North American Mantle Cell Lymphoma Consortium. J. Hematol. Oncol. 16, 122 (2023).
Huang, X. et al. Both c-Myc and Ki-67 expression are predictive markers in patients with extranodal NK/T-cell lymphoma, nasal type: a retrospective study in China. Pathol. Res. Pract. 210, 351–35 (2014).
Jiang, L. et al. Prognostic significance of Ki-67 antigen expression in extranodal natural killer/T-cell lymphoma, nasal type. Med. Oncol. 31, 218 (2014).
Kim, S. J. et al. Ki-67 expression is predictive of prognosis in patients with stage I/II extranodal NK/T-cell lymphoma, nasal type. Ann. Oncol. 18, 1382–1387 (2007).
Yamaguchi, M. et al. Treatments and outcomes of patients with extranodal natural killer/T-cell lymphoma diagnosed between 2000 and 2013: a cooperative study in Japan. J. Clin. Oncol. 35, 32–39 (2017).
Jo, J. C. et al. Expression of programmed cell death 1 and programmed cell death ligand 1 in extranodal NK/T-cell lymphoma, nasal type. Ann. Hematol. 96, 25–31 (2017).
Kataoka, K. et al. Frequent structural variations involving programmed death ligands in Epstein-Barr virus-associated lymphomas. Leukemia 33, 1687–1699 (2019).
Nagato, T. et al. Programmed death-ligand 1 and its soluble form are highly expressed in nasal natural killer/T-cell lymphoma: a potential rationale for immunotherapy. Cancer Immunol. Immunother. 66, 877–890 (2017).
He, L., Chen, N., Dai, L. & Peng, X. Advances and challenges of immunotherapies in NK/T cell lymphomas. iScience 26, 108192 (2023).
Cai, J. et al. Combination of anti-PD-1 antibody with P-GEMOX as a potentially effective immunochemotherapy for advanced natural killer/T cell lymphoma. Signal Transduct. Target Ther. 5, 289 (2020).
Chen, Z. et al. Genomic features reveal potential benefit of adding anti-PD-1 immunotherapy to treat non-upper aerodigestive tract natural killer/T-cell lymphoma. Leukemia 38, 829–839 (2024).
Tian, X. P. et al. First-line sintilimab with pegaspargase, gemcitabine, and oxaliplatin in advanced extranodal natural killer/T cell lymphoma (SPIRIT): a multicentre, single-arm, phase 2 trial. Lancet Haematol. 11, e336–e344, https://doi.org/10.1016/s2352-3026(24)00066-8 (2024).
Kwong, Y. L. et al. PD1 blockade with pembrolizumab is highly effective in relapsed or refractory NK/T-cell lymphoma failing l-asparaginase. Blood 129, 2437–2442 (2017).
Sun, P. et al. A phase II study of sintilimab, anlotinib, and pegaspargase sandwiched with radiotherapy as first-line therapy in patients with newly diagnosed, stage I-II extranodal natural-killer/T-cell lymphoma. Am. J. Hematol. 98, 1043–1051 (2023).
Swerdlow, S. H. et al. The 2016 revision of the World Health Organization classification of lymphoid neoplasms. Blood 127, 2375–2390 (2016).
Collins, G. S., Reitsma, J. B., Altman, D. G. & Moons, K. G. Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): the TRIPOD statement. BMJ 350, g7594 (2015).
Cheson, B. D. et al. Recommendations for initial evaluation, staging, and response assessment of Hodgkin and non-Hodgkin lymphoma: the Lugano classification. J. Clin. Oncol. 32, 3059–3068 (2014).
Qi, S. N. et al. Effect of primary tumor invasion on treatment and survival in extranodal nasal-type NK/T-cell lymphoma in the modern chemotherapy era: a multicenter study from the China Lymphoma Collaborative Group (CLCG). Leuk. Lymphoma 60, 2669–2678 (2019).
Suzuki, R. et al. Prognostic factors for mature natural killer (NK) cell neoplasms: aggressive NK cell leukemia and extranodal NK cell lymphoma, nasal type. Ann. Oncol. 21, 1032–1040 (2010).
Von Luxburg, U. et al. Proc. 31st Annual Conference on Neural Information Processing Systems (NIPS 2017) (Curran Associates Inc., 2017).
Kerr, K. F., Brown, M. D., Zhu, K. & Janes, H. Assessing the clinical impact of risk prediction models with decision curves: guidance for correct interpretation and appropriate use. J. Clin. Oncol. 34, 2534–2540 (2016).
Austin, P. C. An introduction to propensity score methods for reducing the effects of confounding in observational studies. Multivar. Behav. Res. 46, 399–424 (2011).
Kang, L., Chen, W., Petrick, N. A. & Gallas, B. D. Comparing two correlated C indices with right-censored survival outcome: a one-shot nonparametric approach. Stat. Med. 34, 685–703 (2015).
Acknowledgements
We thank Robert Peter Gale for English-language editing and polishing of this manuscript. This work was supported, in part, by the National Natural Science Foundation of China (82003196, 82270198) and the Outstanding Young Scientific and Technological Talents Fund of Sichuan Province (2022JDJQ0059).
Author information
Contributions
S.L., L.-M.G., H.-M.H., and Y.-J.Z. contributed equally as co-first authors. Y.-H.H., J.-P.Y., and Y.L. share last authorship. Y.-H.H. and J.-P.Y. designed the study. All authors contributed to data collection. S.L., Y.-J.Z., and Y.L. performed the statistical analyses. S.L., L.-M.G., H.-M.H., and Y.-J.Z. analyzed and interpreted the data. S.L., L.-M.G., Y.-J.Z., Y.-C.Z., J.-P.Y., and Y.-H.H. drafted the manuscript. All authors verified the underlying data and approved the final submission. All authors had full access to all the data and had final responsibility for the decision to submit for publication.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Li, S., Gao, LM., Hong, HM. et al. Development and validation of a predictive model for extranodal natural killer/T-cell lymphoma. npj Digit. Med. 9, 134 (2026). https://doi.org/10.1038/s41746-025-02316-4
DOI: https://doi.org/10.1038/s41746-025-02316-4