Abstract
Accurate prediction of lymph node metastasis (LNM) is critical for the staging and treatment planning of gastric cancer (GC). This study aimed to develop and validate a multi-module prediction model that integrates clinicopathological features and hematological biomarkers to enhance the preoperative assessment of GC-LNM. A retrospective analysis was conducted on GC patients treated at a single medical center. Clinical variables were categorized into five modules: basic demographic information, tumor characteristics, inflammation-related indicators, coagulation parameters, and nutritional-immune markers. An XGBoost machine learning model was constructed using 19 selected features, and model interpretability was assessed using SHapley Additive exPlanations (SHAP). Model performance was evaluated using the area under the curve (AUC), sensitivity, and specificity across training (80%) and testing (20%) cohorts. Among 1580 patients included in the analysis, 984 (62.3%) had confirmed LNM. The optimized XGBoost model demonstrated excellent predictive performance, achieving an AUC of 0.883 (95% CI 0.864–0.902) in the training set and 0.815 (95% CI 0.767–0.863) in the testing set. SHAP analysis revealed distinct biomarker contribution patterns across different T-stages, Lauren classifications, and histological differentiation grades. In multivariate logistic regression, T4 stage (odds ratio [OR] = 16.091, P < 0.001) and poorly differentiated tumors (OR = 5.891, P < 0.05) were confirmed as independent risk factors for LNM. This interpretable, multi-module machine learning model offers a robust and convenient tool for predicting LNM in GC, facilitating precise risk stratification and individualized treatment decision-making. The observed heterogeneity in biomarker predictive patterns across pathological subtypes also provides novel insights into metastatic mechanisms and supports the development of personalized therapeutic strategies.
Introduction
Gastric cancer (GC) is a malignant tumor that poses a severe threat to human health worldwide. According to the GLOBOCAN statistics, over 968,000 new cases and nearly 660,000 deaths were recorded in 2022, with both incidence and mortality rates ranking fifth globally1. Among various prognostic factors, lymph node metastasis (LNM) status has been established as one of the most clinically significant independent prognostic indicators2,3. Accurate assessment of LNM serves not only as the cornerstone of TNM staging but also as a crucial foundation for developing individualized treatment strategies, predicting survival outcomes, and improving quality of life4. In assessing the risk of LNM in GC, current research emphasizes the combined evaluation of baseline patient characteristics and tumor-specific pathological features to improve stratification accuracy. Baseline factors such as age, sex, and general physical status may indirectly affect immune surveillance and tumor progression. In parallel, pathological indicators, including tumor invasion depth (T stage), maximum tumor diameter (MTD), histological subtype, and differentiation grade, have been widely recognized as direct predictors of LNM4,5,6. Tumors with deeper invasion, larger size, diffuse histology, or poor differentiation tend to exhibit higher metastatic potential6,7. Moreover, elevated levels of serum tumor markers, such as carcinoembryonic antigen (CEA), are strongly associated with LNM risk and offer potential utility for non-invasive prediction8.
GC is recognized as an inflammation-related disease, with inflammatory factors playing crucial roles throughout its development and progression, potentially contributing to the LNM process. Published evidence has confirmed that various inflammatory indicators serve as significant predictors of LNM risk9,10. Basic inflammatory ratios including neutrophil-to-lymphocyte ratio (NLR), platelet-to-lymphocyte ratio (PLR), and monocyte-to-lymphocyte ratio (MLR) can effectively reflect immune-inflammatory balance status, with NLR >1.5 being confirmed as an independent predictive factor in multiple studies11,12. In addition, composite indicators integrating multiple inflammatory parameters, such as systemic inflammatory response index (SIRI), systemic immune-inflammation index (SII), and aggregate index of systemic inflammation (AISI), have demonstrated superior predictive performance and have been validated in multicenter studies13,14. Coagulation function and nutritional status have gained increasing attention in LNM research15. Coagulation indicators such as fibrinogen (FIB) and prothrombin time (PT) participate in tumor cell adhesion and metastasis beyond their hemostatic roles. Nutritional and immune indicators including platelet-to-albumin ratio (PAR)16,17, red cell distribution width-to-albumin ratio (RAR)18, prognostic nutritional index (PNI)19, and hemoglobin-to-red cell distribution width ratio (HRR) have been validated as effective prognostic markers in various diseases20. These coagulation and nutritional parameters show certain associations with LNM risk in GC, though systematic validation remains needed.
Compared to single indicators, multi-module comprehensive indices show greater potential in systematically and holistically capturing patients’ pathophysiological states, thereby enhancing the predictive accuracy for LNM in gastric cancer. In this study, the integration of five key modules—encompassing demographic characteristics, tumor biological behavior, systemic immune-inflammatory status, metabolic and coagulation function, as well as nutritional and immune competence—enabled the construction of a more robust and precise predictive model. This multidimensional data fusion approach provides a novel and effective strategy for realizing individualized treatment in clinical practice. By enabling the preoperative identification of high-risk patients for LNM, the model supports more informed surgical planning, particularly in determining the extent of lymph node dissection. High-risk individuals may benefit from neoadjuvant therapies or more aggressive surgical interventions, while low-risk patients can avoid overtreatment by opting for more conservative approaches, thereby minimizing treatment-related complications and improving overall outcomes.
Methods
Ethics statement
This study was approved by the Ethics Committee of Lanzhou University Second Hospital (approval number: 2025 A-594). Because this was a retrospective study without direct patient interaction, and because patient names were anonymized and no personally identifying information was involved, the requirement for informed consent was waived. This study was conducted in strict accordance with the Declaration of Helsinki and its subsequent amendments.
Study population
Clinical data were retrospectively extracted from the electronic medical record system of Lanzhou University Second Hospital. A total of 1,580 consecutive GC patients diagnosed between January 2013 and December 2023 were included after applying rigorous inclusion and exclusion criteria to ensure data integrity and study validity. Inclusion criteria: (1) Patients with pathologically confirmed GC who underwent radical gastrectomy with R0 resection; (2) Histological types including adenocarcinoma, signet ring cell carcinoma, or mucinous adenocarcinoma; (3) No distant metastasis; (4) Complete clinical data including demographic information (age, sex), laboratory examinations (blood count and coagulation parameters), CEA, pathological staging, tumor differentiation grade, tumor location, and MTD. All patients were staged according to the 8th edition of the American Joint Committee on Cancer (AJCC)/Union for International Cancer Control (UICC) TNM staging system21. Exclusion criteria: (1) Pathological diagnosis of gastrointestinal stromal tumor, neuroendocrine tumor, or benign gastric tumor; (2) Patients who received preoperative chemotherapy or radiotherapy; (3) Patients with severe cardiopulmonary diseases or liver cirrhosis; (4) Patients with concurrent malignancies; (5) Patients who received anticoagulant therapy within 6 months; (6) Patients with incomplete clinical data.
Data variables
The collected data were organized into five main modules: Basic information module: contains patient demographic information including age and gender; Tumor characteristics module: T stage, Lauren classification (histological subtype), tumor location (anatomical site within the stomach), differentiation grade, MTD, and CEA levels; Inflammation-related indicators module22,23: primarily comprises comprehensive inflammatory indices calculated from peripheral blood parameters, including PLR, NLR, SII, etc., with specific calculation formulas shown in Table 1; Coagulation parameters module: mainly includes FIB and PT; Nutritional and immune indicators module: primarily consists of RAR, PAR, and HRR24, with calculation formulas shown in Table 1.
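The composite indices named in these modules can be illustrated with a short Python sketch. The formulas below are the commonly used definitions and are an assumption here, since Table 1 is the authoritative source for this study; the function name `blood_indices`, the argument names, and the unit conventions (cell counts in 10^9/L, albumin and hemoglobin converted from g/L to g/dL for PAR, RAR, and HRR) are hypothetical choices made for illustration only.

```python
def blood_indices(neut, lymph, mono, plt, wbc, alb_g_l, hb_g_l, rdw_pct):
    """Composite indices from a single peripheral blood panel.

    Assumed units: cell counts in 10^9/L, albumin and hemoglobin in g/L,
    RDW in percent. These conventions are assumptions, not taken from Table 1.
    """
    alb_g_dl = alb_g_l / 10.0  # g/L -> g/dL
    hb_g_dl = hb_g_l / 10.0    # g/L -> g/dL
    return {
        # Inflammation-related ratios
        "NLR": neut / lymph,
        "dNLR": neut / (wbc - neut),
        "MLR": mono / lymph,
        "PLR": plt / lymph,
        "NPR": neut / plt,
        # Composite inflammatory indices
        "SII": plt * neut / lymph,
        "SIRI": neut * mono / lymph,
        "AISI": neut * plt * mono / lymph,
        # Nutritional and immune indicators
        "PNI": alb_g_l + 5.0 * lymph,
        "PAR": plt / alb_g_dl,
        "RAR": rdw_pct / alb_g_dl,
        "HRR": hb_g_dl / rdw_pct,
    }


# Hypothetical patient values, chosen only to make the arithmetic easy to follow.
panel = blood_indices(neut=4.0, lymph=2.0, mono=0.5, plt=200.0, wbc=7.0,
                      alb_g_l=40.0, hb_g_l=130.0, rdw_pct=13.0)
for name, value in panel.items():
    print(f"{name}: {value:.3f}")
```

With these illustrative inputs, SII is 200 × 4 / 2 = 400 and PNI is 40 + 5 × 2 = 50; real analyses should follow the exact formulas and units given in Table 1.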
Statistical analysis
Data quality control and baseline characteristics comparison
Data quality control and statistical analyses were performed using R software (version 4.2.3). Patients were divided into two groups based on LNM status (N-stage): no LNM group (N0 stage, status = 0) and LNM group (N1-3 stages, status = 1). Categorical variables (gender, tumor location, Lauren classification, tumor differentiation grade, T stage) were converted to factor variables. Missing data mechanisms were analyzed using the “missana” R package (https://github.com/xiaoqqjun/missana, version 0.0.9), and multiple imputation was employed to impute missing values to ensure data completeness. For factor variables, reference levels were uniformly set: female as reference for gender, upper stomach as reference for tumor location, intestinal type as reference for Lauren classification, well-differentiated as reference for differentiation grade, and T-stage was set as an ordered factor (T1-T4). Distribution characteristics of continuous variables were evaluated using the “transformR” R package (https://github.com/xiaoqqjun/transformR, version 0.1.0), with normality assessed by the Shapiro-Wilk test. For variables not conforming to normal distribution, the best_transform function was used to automatically select optimal transformation methods (primarily including logarithmic transformation, square root transformation, Box-Cox transformation, rank transformation, etc.) to improve normality and homogeneity of variance. Because the continuous variables in the current analysis did not conform to a normal distribution, the Wilcoxon rank-sum test (Mann-Whitney U test) was employed for between-group comparisons, with data presented as median and interquartile range [P25, P75]. Categorical variables were presented as frequency and percentage n (%), with between-group differences assessed using chi-square tests; Fisher’s exact test was used when expected frequencies were less than 5.
All statistical tests were two-sided, with P < 0.05 considered statistically significant.
Univariate association analysis
To identify potential risk factors associated with LNM, univariate logistic regression analysis was performed on all collected variables. For continuous variables (17 items in total, including age, CEA, PT, FIB, PLR, PNI, NLR, PAR, RAR, SIRI, SII, NPR, dNLR, MLR, HRR, AISI, and MTD), logistic regression models were constructed separately to calculate odds ratios (OR) and their 95% confidence intervals (CI), evaluating their associations with LNM. For categorical variables (including gender, tumor location, Lauren classification, differentiation grade, and T-stage), chi-square tests (or Fisher’s exact test where appropriate) were first employed to assess distributional differences between levels, followed by logistic regression analysis to calculate OR values and 95% CIs for each category relative to the reference level.
Multicollinearity diagnosis and variable selection
Prior to multivariate logistic regression analysis, multicollinearity analysis was conducted on variables with statistical significance in univariate logistic regression analysis. Pearson correlation analysis was first used to examine correlations between continuous variables, identifying significantly correlated variable pairs (|r| >0.6). Subsequently, variance inflation factors (VIF) were calculated to quantify the degree of multicollinearity, where VIF < 5 indicates no obvious collinearity problems, 5 ≤ VIF < 10 indicates moderate collinearity requiring attention, and VIF ≥ 10 indicates severe collinearity suggesting variable removal25,26. For variables with severe collinearity, we prioritized retaining variables with greater clinical significance or higher testing efficacy.
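The screening rules in this paragraph can be sketched in Python as follows. This is a simplified illustration rather than the study's R workflow: the helper names are hypothetical, the data are made up, and the VIF is computed only for the two-predictor case, where the R² of regressing one predictor on the other equals the squared Pearson correlation. With more predictors, R² comes from regressing each variable on all the others.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def vif_from_r2(r_squared):
    """VIF = 1 / (1 - R^2), where R^2 is from regressing one predictor on the rest."""
    return 1.0 / (1.0 - r_squared)


def collinearity_level(vif):
    """Apply the thresholds stated in the text: <5, 5 to <10, >=10."""
    if vif < 5:
        return "none"
    if vif < 10:
        return "moderate"
    return "severe"


# Made-up values for two hypothetical predictors.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
r = pearson_r(x, y)              # 0.8 for these values
flagged = abs(r) > 0.6           # correlation screening rule from the text
pair_vif = vif_from_r2(r ** 2)   # two-predictor case only
print(round(r, 3), flagged, round(pair_vif, 3), collinearity_level(pair_vif))
```

Here the pair is flagged by the correlation rule (|r| = 0.8 > 0.6) while its pairwise VIF of about 2.78 stays below 5, which shows why the two checks are applied together rather than interchangeably.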
Machine learning modeling and evaluation
An eXtreme Gradient Boosting (XGBoost) algorithm was employed to construct the predictive model27,28. The data were randomly divided into training and testing sets at an 8:2 ratio, ensuring consistent target variable distribution between groups. Categorical variables underwent dummy encoding to generate numerical feature matrices suitable for machine learning algorithms. XGBoost parameters included: learning rate (eta = 0.03), maximum depth (max_depth = 3), minimum child weight (min_child_weight = 4), subsample ratio (subsample = 0.6), feature sampling ratio (colsample_bytree = 0.6), etc., with L1 (alpha = 0.5) and L2 (lambda = 2) regularization to prevent overfitting. The scale_pos_weight parameter was set according to class imbalance to balance positive and negative samples. Early stopping (early_stopping_rounds = 20) and cross-validation were employed to avoid overtraining. Subsequently, model performance was evaluated using multiple metrics. The area under the receiver operating characteristic curve (AUC-ROC) and its 95% confidence interval were calculated to assess model discriminative ability. Sensitivity, specificity, and accuracy were calculated through confusion matrices29,30. The Youden index was used to determine optimal classification thresholds, balancing sensitivity and specificity31,32. ROC curves were plotted for both the training and test sets to assess model generalization capability33. In addition, calibration, Precision–Recall (PR), and Decision Curve Analysis (DCA) were performed to further evaluate model reliability and clinical usefulness. Calibration curves were generated for both the training and test sets to assess the agreement between predicted and observed outcomes. PR curves and the corresponding area under the curve (PR-AUC) were calculated to evaluate model performance under class imbalance, while DCA quantified the net clinical benefit of using the model across a range of threshold probabilities.
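Threshold selection by the Youden index can be sketched in a few lines of Python. The example uses made-up labels and predicted probabilities, not study data, and the function names are hypothetical; it scans each observed probability as a candidate cutoff and keeps the one that maximizes J = sensitivity + specificity − 1.

```python
def confusion_counts(y_true, y_prob, threshold):
    """Return (TP, FP, TN, FN) when predicting positive if prob >= threshold."""
    tp = fp = tn = fn = 0
    for label, prob in zip(y_true, y_prob):
        predicted = 1 if prob >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def youden_threshold(y_true, y_prob):
    """Return (J, threshold, sensitivity, specificity) at the maximum Youden index.

    Assumes both classes are present; ties keep the lowest threshold.
    """
    best = None
    for threshold in sorted(set(y_prob)):
        tp, fp, tn, fn = confusion_counts(y_true, y_prob, threshold)
        sensitivity = tp / (tp + fn)
        specificity = tn / (tn + fp)
        j = sensitivity + specificity - 1.0
        if best is None or j > best[0]:
            best = (j, threshold, sensitivity, specificity)
    return best


# Illustrative data only: 1 = LNM, 0 = no LNM.
y_true = [0, 0, 0, 1, 1, 0, 1, 1]
y_prob = [0.1, 0.2, 0.3, 0.4, 0.5, 0.55, 0.7, 0.9]
j, cutoff, sens, spec = youden_threshold(y_true, y_prob)
print(j, cutoff, sens, spec)
```

For this toy data the maximum is reached at a cutoff of 0.4, where all four positives are detected and three of the four negatives are correctly excluded.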
Feature importance and interpretability analysis
Feature importance was evaluated using XGBoost’s built-in assessment methods, calculating importance scores for each feature based on information gain, cover, and frequency. To enhance interpretability, SHapley Additive exPlanations (SHAP) analysis was conducted to quantify each feature’s contribution to individual prediction outcomes34,35. Through various visualization methods including SHAP beeswarm plots and force plots, the model’s predictive logic was explained from both global and local perspectives. The influence patterns of different clinical features.
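The Shapley value that SHAP builds on can be illustrated with an exact computation on a toy example. The sketch below is not the tree-based SHAP algorithm applied to the XGBoost model; it enumerates every feature coalition for a hypothetical value function with three made-up features, which is feasible only for a handful of features. The function name, feature names, and payoff values are all assumptions for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, features):
    """Exact Shapley values by enumerating all coalitions of the other features."""
    n = len(features)
    phi = {}
    for feature in features:
        others = [g for g in features if g != feature]
        total = 0.0
        for size in range(n):
            # Weight of a coalition of this size in the Shapley average.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                coalition = frozenset(subset)
                gain = value_fn(coalition | {feature}) - value_fn(coalition)
                total += weight * gain
        phi[feature] = total
    return phi


def toy_risk(coalition):
    """Hypothetical risk contribution when only the given features are known."""
    risk = 0.0
    if "T_stage" in coalition:
        risk += 0.30
    if "MTD" in coalition:
        risk += 0.10
    if "PLR" in coalition:
        risk += 0.05
    if "T_stage" in coalition and "MTD" in coalition:
        risk += 0.10  # interaction term, shared equally between the two features
    return risk


features = ["T_stage", "MTD", "PLR"]
phi = shapley_values(toy_risk, features)
for name in features:
    print(name, round(phi[name], 4))
```

The attributions sum to the difference between the full-coalition value and the empty-coalition value (0.55 here), which is the additivity property that makes SHAP values decompose an individual prediction.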
Model optimization and validation
To determine the optimal model configuration, models of varying complexity were constructed by progressively increasing the number of features, with AUC performance compared across training and testing sets for each model. The trade-off relationship between model complexity and performance was evaluated to identify the optimal feature combination, while overfitting risk was assessed by comparing AUC differences between training and testing sets. Following complexity optimization, the best-performing features identified by XGBoost were used to construct a multivariate logistic regression model for validation. Using the same 8:2 data splitting strategy, comprehensive model diagnostics were performed, including AIC and BIC assessment. Model performance was evaluated through ROC curve analysis, calculating AUC and 95% confidence intervals for both datasets. The Youden index determined optimal classification thresholds, with sensitivity, specificity, and predictive values calculated accordingly. To further understand feature-outcome relationships, Restricted Cubic Splines (RCS) analysis with four knots was applied to key continuous variables, visualizing prediction probability curves to identify optimal threshold points and nonlinear effect patterns36,37.
Results
Baseline characteristics
The study included 1580 GC patients, among whom 596 cases (37.7%) had no LNM and 984 cases (62.3%) exhibited varying degrees of LNM (N1: 217 cases [22.1%], N2: 303 cases [30.8%], N3: 464 cases [47.2%]). There were no statistically significant differences between the LNM and non-LNM groups in age, sex, or tumor location (Table 2, P > 0.05). However, tumor-related characteristics, including CEA levels, MTD, Lauren classification, differentiation grade, and T-stage, showed significant differences (P < 0.05). Patients in the LNM group presented with higher CEA levels, larger tumor size, and more advanced T stages. Moreover, diffuse-type histology (50%) and poorly differentiated tumors (74.9%) were more frequently associated with LNM. Regarding coagulation indicators, the median FIB level was significantly higher in the LNM group (3.28 g/L) than in the non-LNM group (2.90 g/L, P < 0.001). In terms of nutritional and immune parameters, the PNI was markedly lower in the LNM group (median = 47.95, P < 0.001). Conversely, PAR and RAR were significantly elevated in the LNM group (median PAR = 54.648 vs. 48.096; median RAR = 3.363 vs. 3.189; both P < 0.001). The HRR was significantly lower in the LNM group (1.024 vs. 1.107; P < 0.001). As for inflammatory indicators, the LNM group exhibited consistently elevated levels: SIRI (0.845 vs. 0.736, P < 0.001), SII (533.385 vs. 415.295, P < 0.001), NLR (2.341 vs. 2.139, P < 0.001), dNLR (1.751 vs. 1.627, P = 0.001), MLR (0.241 vs. 0.228, P < 0.001), AISI (190.343 vs. 135.987, P < 0.001), and PLR (150.188 vs. 126.049, P < 0.001). Taken together, GC patients with LNM displayed distinct pathophysiological alterations, characterized by higher tumor burden (elevated CEA, increased tumor diameter, advanced T-stage), enhanced systemic inflammation (markedly elevated inflammatory indicators), poor nutritional status (lower PNI), and pronounced immune dysfunction (abnormal immune-inflammatory ratios).
Collectively, these findings highlight significant biological differences between metastatic and non-metastatic GC, providing a solid foundation for the subsequent development of LNM prediction models.
Univariate logistic regression analysis
Univariate logistic regression analysis results (Fig. 1, Table S1) showed that multiple indicators were significant risk factors for LNM (OR > 1, P < 0.05). Among these, MTD, differentiation grade, and T-stage demonstrated higher OR values. Inflammation-related indicators, including SII, PLR, and AISI, were also positively correlated with LNM, highlighting the crucial role of the inflammatory microenvironment in metastatic progression. Additionally, Lauren classification, FIB, CEA, PAR, and nutritional indicators were significantly associated with LNM. In contrast, HRR acted as a protective factor (OR < 1, P < 0.05). The CI for NLR was relatively wide, suggesting its predictive value requires further validation. Baseline characteristics such as age, sex, and tumor location showed no statistical significance.
Forest plot of univariate logistic regression analysis. Red indicates variables with significant associations in univariate analysis, while gray indicates non-significant variables. Dots represent odds ratios (OR), and horizontal lines through the dots represent 95% confidence intervals. MTD, maximum tumor diameter; SII, systemic immune-inflammation index; RAR, red blood cell distribution width-to-albumin ratio; PNI, prognostic nutritional index; PLR, platelet-to-lymphocyte ratio; PAR, platelet-to-albumin ratio; Lauren type, Lauren histological classification; HRR, hemoglobin-to-red blood cell distribution width ratio; FIB, fibrinogen; CEA, carcinoembryonic antigen; AISI, aggregate index of systemic inflammation; NLR, neutrophil-to-lymphocyte ratio; MLR, monocyte-to-lymphocyte ratio; SIRI, systemic inflammatory response index; dNLR, derived neutrophil-to-lymphocyte ratio; PT, prothrombin time; NPR, neutrophil-to-platelet ratio.
Variable correlation and collinearity analysis
Correlation analysis of continuous variables (Fig. 2A) revealed several strong positive correlations (r ≥ 0.75), including SIRI and MLR (r = 0.79), SIRI and AISI (r = 0.86), AISI and SII (r = 0.75), SII and PLR (r = 0.80), SII and NLR (dNLR) (r = 0.84), and NLR and dNLR (r = 0.96). Such high intercorrelations among variables may lead to model overfitting and reduced statistical robustness. VIF analysis (Fig. 2B) further supported these findings, identifying NLR (6.96), SIRI (5.95), SII (5.47), and dNLR (5.25) as having VIF values greater than 5, suggesting substantial multicollinearity. AISI (4.97) was slightly below this threshold, whereas other variables exhibited relatively lower VIF values, indicating milder collinearity. The results of the VIF analysis were highly consistent with the correlation matrix, confirming the presence of multicollinearity among inflammation-related indicators. These findings provided preliminary evidence for feature selection in subsequent modeling steps. To further evaluate the predictive value of individual inflammatory indicators, eight composite inflammatory indices (PLR, SII, AISI, SIRI, NLR, MLR, dNLR, and NPR) were individually modeled against the LNM outcome, and AUC values were calculated (Table S2). The results showed that PLR (AUC = 0.604), SII (AUC = 0.597), and AISI (AUC = 0.593) achieved the highest discriminatory performance, while SIRI (AUC = 0.557), NLR (AUC = 0.557), MLR (AUC = 0.555), and dNLR (AUC = 0.550) demonstrated lower predictive ability. NPR (AUC = 0.461) performed worst, with an AUC below 0.5. Accordingly, PLR, SII, and AISI were selected as the key representative variables of systemic inflammatory indices for inclusion in subsequent comprehensive modeling and analysis.
Correlation analysis of continuous variables and variance inflation factor (VIF) analysis of all variables. (A) Spearman correlation analysis results between continuous variables, where red indicates positive correlation and blue indicates negative correlation. (B) VIF analysis results, where red indicates potential multicollinearity, yellow indicates mild multicollinearity, and green indicates no multicollinearity. MTD, maximum tumor diameter; SII, systemic immune-inflammation index; RAR, red blood cell distribution width-to-albumin ratio; PNI, prognostic nutritional index; PLR, platelet-to-lymphocyte ratio; PAR, platelet-to-albumin ratio; Lauren type, Lauren histological classification; HRR, hemoglobin-to-red blood cell distribution width ratio; FIB, fibrinogen; CEA, carcinoembryonic antigen; AISI, aggregate index of systemic inflammation; NLR, neutrophil-to-lymphocyte ratio; MLR, monocyte-to-lymphocyte ratio; SIRI, systemic inflammatory response index; dNLR, derived neutrophil-to-lymphocyte ratio; PT, prothrombin time; NPR, neutrophil-to-platelet ratio.
XGBoost model feature importance and interpretability analysis
An XGBoost-based predictive model was constructed using 17 clinicopathological variables: T stage, MTD, PLR, PAR, CEA, SII, FIB, AISI, HRR, age, RAR, PNI, PT, differentiation grade, Lauren classification, tumor location, and gender. ROC analysis (Fig. 3A) demonstrated excellent discriminative ability, with an AUC of 0.8833 (95% CI 0.8641–0.9024) in the training set and 0.815 (95% CI 0.7674–0.8626) in the testing set, indicating robust overall performance. Confusion matrix analysis (Fig. 3B) further validated the classification results. In the training set, correctly classified cases (TP = 704, TN = 322) markedly outnumbered misclassified ones (FP = 132, FN = 106), achieving an overall accuracy of 80.1%. In the testing set, correctly predicted positive cases remained stable (TP = 145), but false negatives increased (FN = 53), suggesting a mild decline in prediction stability on independent data. Sensitivity–specificity analysis (Fig. 3C) showed that in the training set, the model achieved high specificity (0.89) and moderate sensitivity (0.747), reflecting strong ability to exclude non-metastatic cases while maintaining good sensitivity for positive detection. However, both metrics declined in the testing set (sensitivity = 0.634, specificity = 0.822), implying partial overfitting. The relatively lower sensitivity may lead to missed positive cases, indicating the need for further optimization through refined feature selection, hyperparameter tuning, or enhanced regularization.
XGBoost model performance evaluation and feature importance analysis. (A) receiver operating characteristic (ROC) curves for the training and testing datasets with optimal cutoff value of 0.44. (B) confusion matrices showing classification performance on training set (left) and test set (right). (C) model performance metrics comparing sensitivity and specificity between training and test datasets. (D) SHAP (SHapley Additive exPlanations) value analysis showing feature importance and contribution to lymph node metastasis (LNM) prediction, with features ranked by importance from top to bottom. (E) Calibration plot for the training and test sets. The blue and red solid lines represent the calibration curves of the model in the training and test sets, respectively. The dashed line indicates the ideal reference line (perfect agreement between predicted and observed outcomes). The dots represent the mean predicted probability and observed incidence after dividing the samples into 10 groups according to predicted risk. In the training set, the 10 data points were evenly distributed with predicted probabilities ranging from 0.08 to 0.83 and observed incidences from 0.06 to 0.98; in the test set, predicted probabilities ranged from 0.08 to 0.82 and observed incidences from 0.03 to 0.97. (F) Precision–Recall (PR) curve. The green solid line represents the relationship between precision and recall at different thresholds, the red dots indicate the threshold points, and the red horizontal line denotes the no-skill classifier baseline. The PR-AUC was 0.8413, demonstrating excellent discriminative performance of the model even under class imbalance (positive rate = 62%). (G) Decision curve analysis (DCA). The x-axis represents the threshold probability, and the y-axis represents the net benefit. The blue solid line indicates the net benefit of the model, while the purple and yellow dashed lines correspond to the “treat-all” and “treat-none” strategies, respectively. 
MTD, maximum tumor diameter; SII, systemic immune-inflammation index; RAR, red blood cell distribution width-to-albumin ratio; PNI, prognostic nutritional index; PLR, platelet-to-lymphocyte ratio; PAR, platelet-to-albumin ratio; Lauren type, Lauren histological classification; HRR, hemoglobin-to-red blood cell distribution width ratio; FIB, fibrinogen; CEA, carcinoembryonic antigen; AISI, aggregate index of systemic inflammation; NLR, neutrophil-to-lymphocyte ratio; MLR, monocyte-to-lymphocyte ratio; SIRI, systemic inflammatory response index; dNLR, derived neutrophil-to-lymphocyte ratio; PT, prothrombin time; NPR, neutrophil-to-platelet ratio.
To evaluate feature contributions, feature importance was ranked using XGBoost’s built-in scoring (Table S3). The top ten predictive variables were T1, MTD, T2, PLR, T4, PAR, T3, CEA, SII, and FIB, with T stage showing the highest overall importance. Subsequent SHAP-based interpretability analysis (Fig. 3D) identified T stage as the pivotal determinant of model predictions. T1 and T2 stages exhibited negative SHAP values, implying limited contribution to LNM risk, whereas T3 and T4 stages yielded strongly positive SHAP values, indicating a marked increase in metastatic probability with deeper invasion. MTD displayed a nonlinear, bidirectional relationship with LNM risk: lower MTD values correlated negatively with LNM likelihood, but beyond a certain threshold, SHAP values became positive, suggesting that larger tumors substantially elevate metastatic risk. Similarly, higher PLR, FIB, HRR, AISI, and CEA values generated positive SHAP values, indicating their potential as predictive biomarkers of elevated metastatic risk. Interestingly, extremely high SII values produced negative SHAP effects, suggesting that the pro-metastatic influence of systemic inflammation may attenuate or reverse under excessive inflammatory burden, possibly due to immune exhaustion or compensatory regulation. Among pathological subtype variables, diffuse-type and poorly differentiated samples consistently showed positive SHAP values, whereas intestinal-type and well-differentiated tumors exhibited negative values, consistent with their distinct biological aggressiveness and metastatic tendency. In contrast, tumor location, gender, and moderately differentiated grade showed SHAP values clustered near zero, indicating limited predictive contribution in this model.
The predictive performance and clinical utility of the model were further evaluated using calibration, PR, and DCA plots (Fig. 3E–G). For the calibration plot (Fig. 3E), in the training set, the 10 data points were evenly distributed, with predicted probabilities ranging from 0.08 to 0.83 and observed incidences from 0.06 to 0.98; in the test set, predicted probabilities ranged from 0.08 to 0.82 and observed incidences from 0.03 to 0.97. In the PR curve (Fig. 3F), the PR-AUC was 0.8413, indicating excellent discriminative performance even under class imbalance (positive rate = 62%). The DCA curve (Fig. 3G) shows that the model (blue solid line) achieved a higher net benefit than either the “treat-all” or “treat-none” strategies across most clinically relevant threshold probabilities, suggesting favorable clinical applicability and decision-making value.
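The net benefit plotted in a decision curve follows a standard formula, NB = TP/n − (FP/n) × pt/(1 − pt), where pt is the threshold probability. The Python sketch below applies it to made-up labels and probabilities, not to study data, and the function names are hypothetical; it also computes the "treat-all" reference, while "treat-none" is zero by definition.

```python
def net_benefit(y_true, y_prob, pt):
    """Net benefit of treating patients whose predicted probability is >= pt."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= pt and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= pt and y == 0)
    return tp / n - (fp / n) * (pt / (1.0 - pt))


def net_benefit_treat_all(y_true, pt):
    """Net benefit when every patient is treated regardless of prediction."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * (pt / (1.0 - pt))


# Illustrative data only: 1 = LNM, 0 = no LNM.
y_true = [0, 0, 0, 1, 1, 0, 1, 1]
y_prob = [0.1, 0.2, 0.3, 0.4, 0.5, 0.55, 0.7, 0.9]

nb_model = net_benefit(y_true, y_prob, pt=0.5)
nb_all = net_benefit_treat_all(y_true, pt=0.5)
nb_none = 0.0
print(nb_model, nb_all, nb_none)
```

At pt = 0.5 the toy model yields a net benefit of 0.25 versus 0 for both reference strategies; a decision curve repeats this calculation across a range of threshold probabilities.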
T-stage stratified SHAP dependency analysis
To examine in depth how the predictive contributions of individual biomarkers vary across T-stages, we constructed stratified SHAP dependency plots (Fig. 4A–F) showing the SHAP value distributions of major features in T1-T4 stage patients. MTD (Fig. 4A1-A4) exhibited distinct influence patterns across T-stages: in T1 stage patients (Fig. 4A1), high MTD values mainly produced positive SHAP values, while this positive association was more pronounced in T4 stage patients (Fig. 4A4), suggesting that MTD’s predictive value is more prominent in advanced GC patients. PAR (Fig. 4B1–B4) displayed nonlinear, bidirectional relationships across all stages, with SHAP values fluctuating between positive and negative as PAR increased; this pattern remained relatively consistent across T stages, indicating a stable yet complex contribution of PAR to LNM prediction. Similarly, PLR (Fig. 4C1–C4), CEA (Fig. 4D1–D4), and FIB (Fig. 4E1–E4) demonstrated stronger positive predictive impacts in T3 and T4 patients, particularly for CEA, whose high-value samples were predominantly located in regions of positive SHAP values, consistent with its well-established role as a tumor biomarker in advanced GC. The systemic inflammatory marker SII (Fig. 4F1-F4) showed relatively stable predictive patterns across all T-stages, though its effect magnitude was modestly enhanced in T3 (Fig. 4F3) and T4 stages (Fig. 4F4). These stratified SHAP analyses emphasize that biomarker influence is stage-dependent, with tumor burden and inflammatory indices contributing more strongly to metastatic prediction in advanced disease.
SHAP dependence analysis stratified by T stage. (A1–A4) maximum tumor diameter (MTD) dependence plots for T1–T4 stages. (B1–B4) platelet-to-albumin ratio (PAR) dependence plots for T1–T4 stages. (C1–C4) platelet-to-lymphocyte ratio (PLR) dependence plots for T1–T4 stages. (D1–D4) carcinoembryonic antigen (CEA) dependence plots for T1–T4 stages. (E1–E4) fibrinogen (FIB) dependence plots for T1–T4 stages. (F1–F4) systemic immune-inflammation index (SII) dependence plots for T1–T4 stages. Color coding represents SHAP values. Each plot demonstrates how feature values influence SHAP values (model predictions) within specific T stage groups.
Lauren classification stratified SHAP dependency analysis
To further explore the influence of histological heterogeneity on biomarker predictive behavior, SHAP dependency plots stratified by Lauren classification were generated (Fig. 5A–F), depicting SHAP value distributions for major features among intestinal-type, mixed-type, and diffuse-type GC patients. MTD (Fig. 5A1–A3) displayed distinctly different contribution patterns across subtypes. In intestinal-type GC (Fig. 5A1), higher MTD values were predominantly associated with positive SHAP effects, indicating stronger predictive importance, whereas in diffuse-type GC (Fig. 5A3) this positive association was attenuated, suggesting that tumor size exerts greater prognostic influence in intestinal-type GC. PAR (Fig. 5B1–B3) exhibited nonlinear, bidirectional SHAP fluctuations across all subtypes, showing broadly similar distributional patterns without notable subtype-specific deviation. In contrast, PLR (Fig. 5C1–C3) demonstrated marked subtype dependence: in intestinal-type GC, elevated PLR values corresponded mainly to positive SHAP regions, reflecting a favorable predictive contribution, and this association was further strengthened in diffuse-type GC, implying an enhanced prognostic role of PLR in this histological subtype. For CEA (Fig. 5D1–D3), the strongest positive SHAP contributions were observed in intestinal-type GC, where high CEA values clustered in the positive SHAP region. Conversely, its predictive relevance declined markedly in diffuse-type GC, likely reflecting intrinsic biological and molecular differences between subtypes. Both FIB (Fig. 5E1–E3) and SII (Fig. 5F1–F3) showed relatively consistent SHAP patterns across Lauren classifications, although their predictive effects were modestly enhanced in intestinal-type GC. 
These subtype-stratified SHAP analyses reveal that the predictive importance of several biomarkers, particularly MTD, PLR, and CEA, varies according to Lauren classification, underscoring the histological context dependence of metastatic risk determinants in GC.
SHAP dependence analysis stratified by Lauren histological classification. (A1–A3) maximum tumor diameter (MTD) dependence plots for Intestinal, Mixed, and Diffuse types. (B1–B3) platelet-to-albumin ratio (PAR) dependence plots across Lauren types. (C1–C3) platelet-to-lymphocyte ratio (PLR) dependence plots across Lauren types. (D1–D3), carcinoembryonic antigen (CEA) dependence plots across Lauren types. (E1–E3), fibrinogen (FIB) dependence plots across Lauren types. (F1–F3) systemic immune-inflammation index (SII) dependence plots across Lauren types. Color coding represents SHAP values. Each plot demonstrates how feature values influence SHAP values (model predictions) within specific histological subtypes.
Differentiation grade stratified SHAP dependency analysis
To evaluate the effect of tumor differentiation on the predictive behavior of key biomarkers, SHAP dependency plots stratified by histological grade were generated (Fig. 6A–F), illustrating SHAP value distributions for major features among well-, moderately-, and poorly differentiated GC cases. MTD (Fig. 6A1–A3) exhibited distinct predictive patterns across differentiation grades. In well-differentiated GC (Fig. 6A1), higher MTD values corresponded to minimal or even negative SHAP effects, indicating limited prognostic relevance. In contrast, in poorly differentiated GC (Fig. 6A3), elevated MTD values were associated with strongly positive SHAP values, reflecting a substantial contribution to the prediction model. These results suggest that tumor size exerts greater prognostic influence in more aggressive, poorly differentiated subtypes. PAR (Fig. 6B1–B3) demonstrated a nonlinear, complex contribution pattern across all differentiation grades. The SHAP value distribution was broader and more positive in poorly differentiated GC (Fig. 6B3), indicating increased predictive relevance of PAR in these patients. Similarly, PLR (Fig. 6C1–C3) displayed grade-dependent variation, exerting little effect in well-differentiated tumors but showing strong positive SHAP contributions in poorly differentiated GC, highlighting its enhanced predictive utility in advanced histologic grades. For CEA (Fig. 6D1–D3), the strongest predictive contribution was again observed in poorly differentiated GC, where elevated CEA levels clustered within positive SHAP regions—consistent with its recognized association with tumor invasiveness and metastatic potential. Both FIB (Fig. 6E1–E3) and SII (Fig. 6F1–F3) followed similar trends, showing greater predictive strength in poorly differentiated tumors than in well-differentiated ones. 
These findings underscore a tight interplay between tumor differentiation status and systemic inflammatory activity, reinforcing the biological significance of inflammation-related biomarkers in predicting lymph-node metastasis within aggressive GC subtypes.
SHAP dependence analysis stratified by histological differentiation grade. (A1–A3) maximum tumor diameter (MTD) dependence plots for well, moderately, and poorly differentiated grades. (B1–B3) platelet-to-albumin ratio (PAR) dependence plots across differentiation grades. (C1–C3) platelet-to-lymphocyte ratio (PLR) dependence plots across differentiation grades. (D1–D3) carcinoembryonic antigen (CEA) dependence plots across differentiation grades. (E1–E3) fibrinogen (FIB) dependence plots across differentiation grades. (F1–F3) systemic immune-inflammation index (SII) dependence plots across differentiation grades. Color coding represents SHAP values. Each plot demonstrates how feature values influence SHAP values (model predictions) within specific differentiation grades.
Individual prediction interpretation analysis based on SHAP
To elucidate the XGBoost model’s decision process at the individual level, four representative patient cases were analyzed using SHAP individual interpretation (Fig. 7). Waterfall plots depicted the additive contributions of each variable to the final prediction, starting from the model’s baseline expectation value (E[f(x)] = − 0.0252) and summing individual SHAP values to yield the prediction outcome f(x). In the first case (Fig. 7A, f(x) = − 1.36), T1 stage exerted the largest negative contribution (− 1.56), substantially reducing the predicted probability of LNM, while MTD (+ 0.233) and PLR (+ 0.144) produced moderate positive contributions, ultimately predicting this patient as low risk. In the second case (Fig. 7B, f(x) = − 0.428), the negative contribution of T2 stage (− 0.604) was partly counterbalanced by high MTD (+ 0.334) and several inflammatory indicators with positive SHAP values, yielding a prediction that, although still classified as low risk, lay closer to the decision boundary than the first case. The third (Fig. 7C, f(x) = 1.33) and fourth (Fig. 7D, f(x) = 1.31) cases were both predicted as high risk, where synergistic positive SHAP effects from MTD, advanced T stage (T3/T4), and PLR collectively pushed predictions beyond the classification threshold. This individualized interpretability analysis supports the clinical rationality and transparency of the XGBoost model, while offering a visual, patient-specific explanation of risk-driving factors that may assist clinicians in developing personalized surveillance and treatment strategies.
SHAP waterfall plots showing individual patient predictions with feature contributions. A-D, Four representative cases demonstrating how different clinical and laboratory features contribute to lymph node metastasis (LNM) prediction. Each horizontal bar represents a feature’s contribution to the final prediction, with red bars indicating positive contributions (increasing lymph node metastasis risk) and blue bars indicating negative contributions (decreasing risk). The x-axis shows the prediction value, starting from the expected value (E[f(x)]) and ending at the final prediction f(x). Feature values are displayed next to each bar. A, Patient with T1 stage shows f(x) = − 1.36 and indicates low-risk prediction. B, Patient with T2 stage shows f(x) = − 0.428 and indicates low-risk prediction. C, Patient with T3 stage shows f(x) = 1.33 and indicates high-risk prediction. D, Patient with T4 stage shows f(x) = 1.31 and indicates high-risk prediction. MTD, maximum tumor diameter; SII, systemic immune-inflammation index; PLR, platelet-to-lymphocyte ratio; PAR, platelet-to-albumin ratio; Lauren type, Lauren histological classification (poorly); FIB, fibrinogen; CEA, carcinoembryonic antigen.
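The additive reading of the waterfall plots can be made concrete with a small sketch. It assumes that f(x) is expressed on the log-odds scale (the usual output scale for SHAP values of a binary-logistic gradient-boosting model) and that the 0.44 cutoff from the Fig. 8A caption applies to the resulting probabilities; neither assumption is confirmed by the text, and the helper names are hypothetical.

```python
import math

BASE_VALUE = -0.0252   # E[f(x)] reported for the waterfall plots
CUTOFF = 0.44          # assumed: probability cutoff quoted in the Fig. 8A caption

# f(x) values reported for the four representative cases (Fig. 7A-D).
reported_fx = {"A (T1)": -1.36, "B (T2)": -0.428, "C (T3)": 1.33, "D (T4)": 1.31}


def log_odds_to_probability(fx):
    """Logistic (sigmoid) transform; assumes fx is a log-odds value."""
    return 1.0 / (1.0 + math.exp(-fx))


def summarize(fx_by_case, base_value=BASE_VALUE, cutoff=CUTOFF):
    """Return total SHAP shift, implied probability, and risk label per case."""
    summary = {}
    for case, fx in fx_by_case.items():
        total_shap = fx - base_value            # sum of all feature contributions
        prob = log_odds_to_probability(fx)
        label = "high" if prob >= cutoff else "low"
        summary[case] = (total_shap, prob, label)
    return summary


for case, (total_shap, prob, label) in summarize(reported_fx).items():
    print(f"{case}: sum(SHAP) = {total_shap:+.4f}, p = {prob:.3f}, {label} risk")
```

Under these assumptions, cases A and B fall below the cutoff and cases C and D fall above it, which matches the risk labels given in the figure caption.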
Model performance evaluation and feature optimization analysis
To assess the model’s probability distribution and classification boundary, prediction probability histograms were generated according to actual labels (Fig. 8A). The results demonstrated clear categorical separation: negative samples (blue) were primarily concentrated in low probability ranges (0.0–0.3), with approximately 70% below the 0.5 decision threshold, indicating effective identification of low-risk cases; positive samples (red) were enriched in high probability ranges (0.6–0.9), with roughly 80% exceeding 0.5, reflecting strong detection capability for high-risk patients. To further examine the effect of feature quantity on model performance, recursive feature elimination (RFE) was performed, constructing XGBoost models with 7, 13, 19, and 27 features. As feature number increased, training-set AUC improved from 0.8492 to 0.8833, while testing-set AUC reached its peak at 0.8225 with 19 features (Fig. 8B). Comparison of training–testing AUC differences revealed that the 19-feature model achieved the lowest overfitting degree (difference = 0.0487; Fig. 8C), confirming its optimal balance between complexity and generalization.
Model validation and feature selection analysis. (A) distribution of predicted probabilities for lymph node metastasis (LNM), showing actual outcomes (blue: no LNM, red: LNM). The vertical dashed line indicates the optimal cutoff threshold (0.44). (B) area under the curve (AUC) values for training set (blue) and validation set (purple) across different numbers of features (7, 13, 19, 27), demonstrating model performance with feature selection. (C) AUC difference between training and validation sets across different feature numbers, indicating model generalization performance and potential overfitting assessment.
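The overfitting measure shown in Fig. 8C is the gap between training-set and testing-set AUC. The sketch below computes AUC as the Mann–Whitney rank statistic (the probability that a randomly chosen positive case is scored above a randomly chosen negative one) and then takes the difference. The scores are invented toy values and the function names are hypothetical; this is not the study's pipeline.

```python
def auc(y_true, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = 0.0
    for sp in pos:
        for sn in neg:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def auc_gap(train, test):
    """Training AUC minus testing AUC; larger values suggest more overfitting."""
    return auc(*train) - auc(*test)


# Invented toy data (NOT study data): (labels, predicted scores).
train = ([1, 1, 0, 0, 1, 0], [0.9, 0.8, 0.3, 0.2, 0.7, 0.4])
test = ([1, 0, 1, 0], [0.8, 0.6, 0.5, 0.3])

print("train AUC:", auc(*train))
print("test AUC:", auc(*test))
print("gap:", auc_gap(train, test))
```

Repeating this calculation for each candidate feature count and choosing the configuration with high testing AUC and a small gap corresponds to the selection logic described for the 19-feature model.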
Multivariate logistic regression analysis based on 19 features
To verify the robustness of key features identified by machine learning, multivariate logistic regression analysis was performed using the 19 features retained by the optimized XGBoost model (Table 3). Results indicated that T stage was the most powerful independent predictor: T4 stage (OR = 16.091, 95% CI 9.711–26.661, P < 0.001) and T3 stage (OR = 14.751, 95% CI 8.326–26.135, P < 0.001) both showed extremely high odds ratios compared with T1 stage, corresponding to roughly 16-fold and 15-fold higher odds of LNM, respectively. T2 stage also remained significant (OR = 3.667, 95% CI 2.138–6.291, P < 0.001). MTD emerged as another independent predictor (OR = 2.425, 95% CI 1.682–3.497, P < 0.001), indicating that each incremental unit of tumor size multiplied the odds of LNM by approximately 2.4. Among pathological characteristics, poorly differentiated tumors were strongly associated with LNM (OR = 5.891, 95% CI 1.477–23.491, P < 0.05), whereas moderately differentiated tumors exhibited a similar trend without reaching significance (OR = 3.322, P = 0.067). Notably, PLR, CEA, and SII did not retain statistical significance in multivariate analysis, implying that their predictive contributions may operate indirectly through interactions or shared pathways with other variables rather than as independent determinants.
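Because an odds ratio acts on odds rather than on probability, its effect on the predicted probability depends on the baseline risk. The sketch below illustrates this with the reported OR for MTD, and back-calculates the logistic coefficient and its standard error from the published OR and 95% CI. The baseline probabilities are arbitrary illustrative values, and the function names are hypothetical.

```python
import math


def coefficient_from_or(odds_ratio):
    """Logistic regression coefficient (log-odds change) implied by an odds ratio."""
    return math.log(odds_ratio)


def standard_error_from_ci(lower, upper, z=1.959964):
    """Standard error implied by a Wald 95% CI reported on the odds-ratio scale."""
    return (math.log(upper) - math.log(lower)) / (2.0 * z)


def apply_odds_ratio(baseline_probability, odds_ratio):
    """Probability after multiplying the baseline odds by the odds ratio."""
    baseline_odds = baseline_probability / (1.0 - baseline_probability)
    new_odds = baseline_odds * odds_ratio
    return new_odds / (1.0 + new_odds)


OR_MTD, CI_LOW, CI_HIGH = 2.425, 1.682, 3.497  # values reported for MTD

beta = coefficient_from_or(OR_MTD)
se = standard_error_from_ci(CI_LOW, CI_HIGH)
print(f"beta = {beta:.4f}, SE = {se:.4f}")

# Arbitrary baseline risks, chosen only to show the nonlinearity.
for p0 in (0.10, 0.50, 0.90):
    print(f"baseline p = {p0:.2f} -> p = {apply_odds_ratio(p0, OR_MTD):.3f}")
```

With a baseline probability of 0.50, an OR of 2.425 raises the probability to about 0.71 rather than multiplying it by 2.4, which is why the result is best read as a change in odds.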
Optimized model performance validation and nonlinear relationship analysis of continuous variables
The optimized model based on 19 features demonstrated strong discriminative performance on the independent testing set. ROC curve analysis (Fig. 9A) showed an AUC value of 0.829 (95% CI 0.804–0.854) in the training set and 0.818 (95% CI 0.769–0.866) in the testing set, with an absolute difference of only 0.011, indicating excellent generalization ability and model stability. To further investigate the nonlinear associations between continuous variables and prognostic risk, RCS analysis was performed for six major continuous features (Fig. 9B–G). The SII (Fig. 9B) exhibited a complex nonlinear pattern, with the predicted probability rising steeply between 500 and 1000, then plateauing before showing a slight decline at higher values. FIB (Fig. 9C), CEA (Fig. 9D), and PAR (Fig. 9E) displayed positive linear or mildly nonlinear relationships, where predicted risk progressively increased with rising biomarker values—consistent with their established clinical relevance. PLR (Fig. 9F) demonstrated an inverted U-shaped relationship, with peak risk observed in the 150–250 range, followed by a gradual decline beyond this interval, reflecting the dynamic and bidirectional influence of PLR on prognosis. MTD (Fig. 9G) showed a pronounced S-shaped relationship, where prediction probability increased sharply within the 2–6 cm range before plateauing, suggesting a clear threshold effect beyond which additional tumor enlargement exerted minimal incremental risk. These RCS analyses quantified the continuous and nonlinear relationships between major biomarkers and prognostic risk, offering valuable evidence for risk stratification based on biomarker thresholds and validating that the machine learning model effectively captures intricate nonlinear associations inherent in gastric cancer progression.
Receiver operating characteristic (ROC) curve analysis and restricted cubic spline (RCS) analysis for lymph node metastasis (LNM) prediction model. (A) ROC curves comparing training set (AUC = 0.829, 95% CI 0.804–0.854) and test set (AUC = 0.818, 95% CI 0.769–0.866) performance. (B–G) RCS curves showing the relationship between biomarker values and predicted probability of LNM across risk quartiles (K1–K4): (B) SII (systemic immune-inflammation index); (C) FIB (fibrinogen); (D) CEA (carcinoembryonic antigen); (E) PAR (platelet-to-albumin ratio); (F) PLR (platelet-to-lymphocyte ratio); (G) MTD (maximum tumor diameter). Each RCS curve demonstrates the non-linear relationship between biomarker levels and LNM risk, with shaded areas representing confidence intervals.
Discussion
This study employed an XGBoost machine learning framework combined with SHAP interpretability analysis to comprehensively delineate the complex predictive relationships between integrated clinicopathological characteristics and GC-LNM. Among the 1,580 patients included, 62.3% presented with varying degrees of LNM. Patients in the LNM group exhibited significant multidimensional alterations encompassing tumor morphology, inflammatory status, and nutritional–immune balance, establishing a solid biological foundation for precise LNM risk modeling. Analyses ranging from baseline comparisons to univariate regression demonstrated that tumor-related indicators such as MTD, T stage, and differentiation grade were strongly and positively associated with LNM, whereas nutrition-related indices including HRR and PNI acted as negative protective factors. Among inflammation-associated biomarkers, SII, PLR, and AISI emerged as significant risk indicators, collectively suggesting that GC-LNM represents a multifactorial process driven by tumor aggressiveness, host inflammatory responses, and metabolic–nutritional dysregulation4,38.
The optimized XGBoost model incorporating 19 clinicopathological features achieved discriminative performance (training AUC = 0.88; testing AUC = 0.82), underscoring the superiority of multidimensional feature integration for GC-LNM prediction. SHAP analysis confirmed the central importance of T stage as the most critical predictor: T1 contributed negative SHAP values (protective), whereas T4 produced large positive SHAP effects (high-risk). Other key variables—including MTD, PLR, CEA, and FIB—also exhibited substantial predictive contributions. Stratified SHAP dependency analyses further revealed heterogeneity in biomarker behavior across clinical subgroups. MTD demonstrated enhanced predictive efficacy in T4-stage and poorly differentiated tumors; CEA showed the strongest positive SHAP effects in intestinal-type GC; and inflammatory indicators such as PLR and SII exhibited heightened predictive relevance in advanced or poorly differentiated cases. Recursive feature elimination identified the 19-feature configuration as the optimal balance between model accuracy and generalization, confirming the model’s clinical applicability. Multivariate logistic regression validation supported the robustness of these findings, identifying T4 stage, T3 stage, MTD, and poor differentiation as independent predictors, consistent with the SHAP-based interpretability results. Finally, RCS analysis highlighted nonlinear associations between continuous biomarkers and metastatic risk: MTD displayed an S-shaped curve, while PLR followed an inverted U-shaped trend. Together, these quantitative insights provide a data-driven framework for biomarker-based risk stratification and demonstrate how explainable machine learning can capture the complex, nonlinear mechanisms underlying GC-LNM.
In recent years, machine learning (ML) approaches have emerged as promising tools for predicting LNM in gastric cancer, offering advantages in handling complex, high-dimensional data and capturing nonlinear relationships among multiple variables39,40,41. Earlier ML-based prediction models predominantly utilized basic clinicopathological features (age, tumor size, T-stage, differentiation grade) and achieved moderate discriminative performance, with AUC values typically ranging from 0.70 to 0.90. Some studies have incorporated imaging-based radiomics features through deep learning architectures, achieving improved accuracy (AUC 0.80–0.95)42,43,44, though these models often function as “black boxes” lacking mechanistic interpretability and clinical transparency. Other investigations have explored high-dimensional genomic or other signatures45,46, but their clinical translation remains limited due to high costs, specialized testing requirements, and challenges in reproducibility across different institutions. Importantly, most existing ML models have not comprehensively integrated systemic inflammatory, immunonutritional, and metabolic biomarkers—which are readily available from routine clinical testing—nor have they provided stratified interpretability across clinically relevant subgroups or validated clinical utility through decision curve analysis9. These gaps highlight the need for more holistic, interpretable, and clinically actionable prediction frameworks. Building upon these considerations, the present study developed an XGBoost-based prediction model that integrates five key biological dimensions—demographic characteristics, tumor biological behavior, systemic immune-inflammatory status, metabolic and coagulation function, and nutritional-immune competence. 
By leveraging SHAP for model interpretability and validating clinical utility through calibration and decision curve analyses, this multidimensional framework aims to provide a transparent, robust, and clinically actionable tool for preoperative LNM risk stratification. Ultimately, this approach may enable more informed surgical planning, guide the extent of lymph node dissection, and support personalized treatment strategies—with high-risk patients potentially benefiting from neoadjuvant therapies or more aggressive interventions, while low-risk patients can avoid overtreatment and minimize treatment-related complications.
From a clinical perspective, the current study provides important guidance for prevention, diagnosis, and treatment of GC-LNM47,48. First, disease prevention and risk stratification. The XGBoost prediction model developed in this study provides clinicians with a non-invasive, clinicopathology-based tool for preoperative risk assessment. By integrating multiple variables—such as MTD, tumor characteristics, PLR, SII, and AISI—the model enables accurate identification of patients at high risk for LNM. DCA demonstrated that the model provides substantial net benefit compared to “treat-all” or “treat-none” strategies across threshold probabilities of 0.06–0.85, with optimal performance in the clinically relevant range of 0.30–0.70. This supports the formulation of individualized preventive intervention strategies, especially for early warning of potential LNM in patients with earlier T-stage disease. Furthermore, SHAP-based stratified analyses revealed that patients with different T-stages, Lauren classifications, and histological differentiation grades exhibit distinct biomarker importance patterns, providing scientific evidence for developing precise risk stratification systems tailored to patient-specific characteristics.
Second, diagnostic optimization and preoperative assessment. This study significantly improved diagnostic accuracy and clinical decision-making efficiency for GC-LNM. Traditional imaging examinations such as CT and MRI have limitations in identifying microscopic LNM, while our prediction model achieved an AUC of 0.818 (95% CI 0.769–0.866) on the independent testing set, with sensitivity and specificity of 63.4% and 82.2%, respectively. The model demonstrates excellent calibration (Brier score 0.168) and robust performance under class imbalance (PR-AUC 0.841), representing a 35.6% improvement over random classification. For preoperative lymph node dissection planning, the DCA results provide actionable guidance: patients with predicted probabilities > 0.70 may be considered for extended lymph node dissection or neoadjuvant chemotherapy, while those with probabilities < 0.30 may benefit from limited dissection approaches, thereby minimizing surgical trauma while ensuring oncological adequacy. The model’s flexible threshold-dependent framework allows clinicians to adjust surgical aggressiveness according to individual patient factors and institutional practices.
Third, treatment strategy development and prognostic management. Based on SHAP individual prediction interpretation analysis, clinicians can clearly understand the specific risk factor composition for each patient, thereby developing targeted treatment strategies. For patients with high MTD values and significantly elevated inflammatory indicators, preoperative neoadjuvant chemotherapy may be considered to reduce tumor burden and systemic inflammatory response49,50. For patients with significantly decreased nutritional indicators such as PNI, perioperative nutritional support treatment should be strengthened. Additionally, the nonlinear relationships revealed by restricted cubic spline analyses provide clinically actionable thresholds: PLR values in the 150–250 range correspond to peak predicted risk, and MTD in the 2–6 cm range marks the interval over which predicted risk rises most steeply; both ranges may warrant heightened surveillance. In summary, the integration of high discrimination, excellent calibration, and demonstrated clinical net benefit establishes this XGBoost model as a practical clinical decision support tool for GC-LNM risk assessment, surgical planning, and treatment selection.
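The PR-AUC improvement and Brier score quoted in the diagnostic-optimization paragraph can be put in context with a short arithmetic sketch. It assumes that the "random classification" baseline for PR-AUC is the positive rate (0.62 as rounded in the Results, 0.623 from the cohort counts) and that a prevalence-only model is a reasonable Brier reference; both are assumptions, since the testing-set prevalence is not stated separately.

```python
def relative_improvement(value, baseline):
    """Fractional improvement of `value` over `baseline`."""
    return value / baseline - 1.0


def brier_of_constant_prediction(prevalence):
    """Brier score of a model that always predicts the prevalence: p * (1 - p)."""
    return prevalence * (1.0 - prevalence)


PR_AUC = 0.841          # reported PR-AUC
BRIER_REPORTED = 0.168  # reported Brier score

# Assumed baselines: the positive rate, rounded (0.62) and unrounded (0.623).
gain_rounded = relative_improvement(PR_AUC, 0.62)
gain_unrounded = relative_improvement(PR_AUC, 0.623)
brier_reference = brier_of_constant_prediction(0.623)

print(f"PR-AUC gain vs 0.62 baseline:  {gain_rounded:.1%}")
print(f"PR-AUC gain vs 0.623 baseline: {gain_unrounded:.1%}")
print(f"Brier reference (prevalence-only model): {brier_reference:.3f}")
```

The reported 35.6% figure is reproduced when the rounded positive rate of 0.62 is used as the baseline, and the reported Brier score of 0.168 lies below the prevalence-only reference, as expected for a model that carries useful information.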
This study found that several inflammation-related biomarkers—including PLR, SII, and AISI—were significantly elevated in the LNM group. SHAP analysis further confirmed their high importance in the prediction model. These findings underscore the pivotal role of tumor-associated inflammation in driving metastasis13,51,52. Elevated PLR, for instance, reflects an imbalance between platelet activation and suppressed lymphocyte-mediated immune surveillance. Platelets are known to promote metastasis by facilitating angiogenesis through vascular endothelial growth factor (VEGF) and platelet-derived growth factor (PDGF), and by shielding circulating tumor cells from immune attack via the formation of platelet–tumor cell aggregates53,54,55. Interestingly, some biomarkers showed variable predictive directions across different subgroups. This context-dependent behavior likely arises from biological heterogeneity under diverse clinical or physiological states. For example, while elevated inflammatory markers may indicate tumor progression or immune evasion in certain patients56,57, they might instead represent robust immune responses or compensatory mechanisms in others—thus correlating with reduced metastatic risk. Such inconsistencies suggest that predictive modeling should account for interaction effects and potential stratification. Future research should explore interaction term modeling and stratified analyses to enhance the model’s generalizability and clinical interpretability.
Changes in nutritional-immune biomarkers, such as reduced PNI and elevated PAR and RAR, were strongly associated with LNM. These indices not only reflect systemic nutritional and immune status but also suggest that tumors may promote metastasis by reprogramming host metabolism to create a supportive microenvironment58,59,60. Low PNI, in particular, may indicate both malnutrition and immune suppression, serving as a critical link between host status and metastatic potential61,62. SHAP-based stratified analysis revealed significant differences in biomarker importance patterns among patients with varying Lauren classifications and histological differentiation grades. For instance, MTD exhibited a stronger predictive effect in intestinal-type GC, and CEA likewise showed its most prominent positive contribution in this same subtype. These results reflect the molecular and biological diversity of GC and offer mechanistic insights into how distinct tumor phenotypes influence metastatic behavior. Finally, RCS analysis uncovered critical nonlinear relationships between continuous variables and LNM risk. Notably, MTD demonstrated an S-shaped association, with risk rising sharply over an intermediate size range and then plateauing, suggesting a threshold beyond which further tumor enlargement adds little incremental risk, while PLR showed an inverted U-shaped relationship, indicating that both extremely low and extremely high PLR levels may be associated with reduced metastasis risk. These nonlinear patterns offer important quantitative evidence for establishing precise biomarker thresholds and optimizing risk stratification strategies. Such insights are crucial for tailoring individualized surveillance and intervention plans.
Despite the significant progress achieved in predicting LNM in GC, this study has several limitations. First, this study adopted a single-center retrospective design, including data from 1,580 patients at Lanzhou University Second Hospital. Retrospective analyses are inherently subject to selection bias and information bias, as clinical data collection depends on the completeness and accuracy of medical records, which may result in missing or inaccurate information. Second, the study population was derived entirely from a single medical center, which may introduce regional and institutional bias and limit the external validity and generalizability of the findings. Differences in genetic background, lifestyle, dietary habits, and healthcare resources across regions could influence GC incidence and prognostic patterns. We plan to conduct multicenter prospective studies in the future to further validate the model’s generalizability and clinical utility across different healthcare settings. Third, the selection of the five core feature modules (demographic characteristics, tumor biological behavior, systemic inflammatory status, metabolic and coagulation function, and nutritional–immune capacity) was primarily constrained by the structure and availability of collected data. Although these modules reflect clinically relevant biological dimensions, other potentially valuable variables could not be incorporated due to dataset limitations. Fourth, although this study included multiple clinicopathological and biochemical biomarkers, the testing periods, methods, and reference ranges may have shown minor variability over the ten-year data collection period (2013–2023). While some laboratory instruments were updated, reference intervals and internal quality-control standards remained consistent. Tumor staging and maximum diameter were obtained from standardized pathology reports, minimizing potential measurement bias caused by equipment differences. 
Fifth, preoperative imaging indicators (e.g., CT or MRI features) were not included in this analysis. Given the wide time span of the dataset, substantial variability existed in imaging quality and scanning protocols due to technological updates. Moreover, our team currently lacks the infrastructure for standardized imaging preprocessing and multimodal data fusion. Future work will focus on integrating imaging features to further improve predictive accuracy. Sixth, although the XGBoost model demonstrated good predictive performance and stability, the current validation was limited to internal validation. External validation using independent datasets from multiple institutions is still required to confirm model robustness in real-world clinical settings. Seventh, this study primarily focused on LNM prediction and did not assess long-term outcomes such as overall survival (OS) or disease-free survival (DFS). Although LNM is an important prognostic factor, its relationship with patient prognosis requires further investigation through long-term prospective follow-up. Finally, while RFE effectively optimized model performance, this method may still involve a risk of overfitting and could potentially exclude features with important biological significance but weaker statistical associations. Future research will consider combining machine learning–based selection with domain-driven feature engineering to enhance both robustness and interpretability.
Conclusions
This study developed a multi-module machine learning model for predicting LNM in GC based on clinicopathological biomarkers. Using data from 1,580 patients, the XGBoost model—integrating 19 key features including tumor T-stage, MTD, and PLR—demonstrated robust predictive performance and strong clinical applicability. The multi-module framework enabled the incorporation of complementary biological dimensions such as tumor burden, systemic inflammation, and nutritional status, thereby enhancing both prediction accuracy and biological interpretability. SHAP-based interpretability analysis identified tumor T-stage as the most influential predictor and uncovered substantial heterogeneity in biomarker contributions across different pathological subtypes. Collectively, these findings not only improve the understanding of GC-LNM pathogenesis but also underscore the pivotal roles of tumor microenvironment–associated inflammation and nutritional metabolic reprogramming in promoting metastasis.
Data availability
The data and code that support the findings of this study are available from the corresponding author or the first author upon reasonable request and for justified academic use.
Abbreviations
- AISI: Aggregate index of systemic inflammation
- AJCC: American Joint Committee on Cancer
- AUC-ROC: Area under the receiver operating characteristic curve
- CEA: Carcinoembryonic antigen
- CI: Confidence intervals
- dNLR: Derived neutrophil-to-lymphocyte ratio
- FIB: Fibrinogen
- GC: Gastric cancer
- HRR: Hemoglobin-to-red cell distribution width ratio
- LNM: Lymph node metastasis
- MLR: Monocyte-to-lymphocyte ratio
- MTD: Maximum tumor diameter
- NLR: Neutrophil-to-lymphocyte ratio
- NPR: Neutrophil-to-platelet ratio
- OR: Odds ratios
- PAR: Platelet-to-albumin ratio
- PDGF: Platelet-derived growth factor
- PLR: Platelet-to-lymphocyte ratio
- PNI: Prognostic nutritional index
- PT: Prothrombin time
- RAR: Red cell distribution width-to-albumin ratio
- RCS: Restricted cubic spline
- RFE: Recursive feature elimination
- SHAP: SHapley Additive exPlanations
- SII: Systemic immune-inflammation index
- SIRI: Systemic inflammatory response index
- UICC: Union for International Cancer Control
- VEGF: Vascular endothelial growth factor
- VIF: Variance inflation factors
- XGBoost: eXtreme gradient boosting
Acknowledgements
We sincerely thank all members of our research team for their valuable contributions and close collaboration throughout the study.
Funding
This work was funded by the Fundamental Research Funds for the Central Universities of Lanzhou University [grant number: lzujbky-2022-sp08], the Medical Innovation and Development Project of Lanzhou University [grant numbers: lzuyxcx-2022-154 and lzuyxcx-2022-141], and the Major Science and Technology Projects of Gansu Province [grant numbers: 20ZD7FA003, 22ZD6FA050, and 22JR9KA002].
Author information
Contributions
Feifei Ding: Data curation, Investigation, Writing—original draft. Zhijun Feng: Formal Analysis, Software, Writing—original draft. Binjie Huang and Jie Liu: Validation, Visualization. Yumin Li: Conceptualization, Methodology, Project administration, Writing—review & editing. All authors read and approved the final manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Ding, F., Huang, B., Liu, J. et al. Interpretable machine learning analysis of clinicopathological and immunonutritional biomarkers for predicting lymph node metastasis in gastric cancer. Sci Rep 15, 44964 (2025). https://doi.org/10.1038/s41598-025-29004-3