Introduction

Colorectal cancer (CRC) remains one of the greatest threats to human health. It is the third most prevalent malignant tumor, accounting for 10% of all new malignancies1. CRC is also the second leading cause of cancer-related death, constituting 9.4% of all cancer deaths2. Accurate prognostic staging of CRC patients is crucial for improving outcomes, because it guides the choice of appropriate treatment.

Lymph node metastasis (LNM) is one of the most critical prognostic factors in patients with CRC3,4. However, the current staging standard, the eighth edition of the American Joint Committee on Cancer (AJCC 8th) N staging system, which is the most widely used in clinical practice, is not flawless5,6. It has been queried for its limited inclusion of the factors needed to stage lymph node metastases accurately. In particular, its omission of the number of examined lymph nodes has drawn the greatest skepticism7. Studies have shown that CRC patients with adequate lymph node dissection have a significantly better prognosis than those with inadequate lymph node dissection8,9. Because the AJCC N staging system considers only the number of positive lymph nodes in the prognostic stratification of patients, it may introduce bias10. The concept of the lymph node ratio (LNR), the ratio of the number of positive lymph nodes to the number of examined lymph nodes, was therefore proposed11. Because it incorporates the number of lymph nodes cleared, the LNR should in theory stratify patients better than the number of positive lymph nodes alone. Previous studies support this, and the LNR has shown excellent prognostic performance in several types of cancer, including lung, oral cavity, breast, and esophageal cancers12,13,14,15,16,17,18. The LNR has also been identified as an independent prognostic factor for CRC19,20. Nonetheless, most of these studies, when searching for LNR cutoff values, either adopted less rigorous statistical methods, such as means, quartiles, or other arbitrary categorizations, and therefore did not stratify the LNR properly21,22,23, or used a more rigorous approach but lacked sufficient samples to yield accurate and representative results20.

Additionally, because it also accounts for both positive and cleared lymph nodes, the log odds of positive lymph nodes (LODDS) has been considered as a lymph node prognostic stratification criterion24. The LODDS is defined as the logarithm of the ratio of the probability of a lymph node being positive to the probability of it being negative when one lymph node is examined, and its stratifying properties have been verified in gastric, colorectal, and bladder cancers25,26,27,28.
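The two definitions above can be made concrete with a minimal sketch. The function names are hypothetical, and the continuity correction of 0.5 in the LODDS is an assumption borrowed from common usage in the literature; the text above does not state which correction, if any, was applied in this study.

```python
import math


def lymph_node_ratio(positive: int, examined: int) -> float:
    """LNR: number of positive nodes divided by number of examined nodes."""
    if examined <= 0 or not 0 <= positive <= examined:
        raise ValueError("require examined > 0 and 0 <= positive <= examined")
    return positive / examined


def lodds(positive: int, examined: int, correction: float = 0.5) -> float:
    """Log odds of positive nodes.

    The `correction` term (assumed 0.5) keeps the ratio finite when
    either the positive or the negative count is zero.
    """
    if examined <= 0 or not 0 <= positive <= examined:
        raise ValueError("require examined > 0 and 0 <= positive <= examined")
    negative = examined - positive
    return math.log((positive + correction) / (negative + correction))


if __name__ == "__main__":
    # Hypothetical patient: 3 positive nodes among 18 examined.
    print(lymph_node_ratio(3, 18), lodds(3, 18))
```

Both quantities need only the positive and examined counts, which is why they can be derived from routine pathology reports.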

In this study, to construct a proper lymph node stratification model and determine its optimal cutoff values in stage III CRC patients who received radical surgery, we conducted a comprehensive statistical analysis based on data from the Surveillance, Epidemiology, and End Results (SEER) database and Zhejiang Cancer Hospital, China.

Methods

Patients and study design

Two cohorts were included in this study. The SEER database cohort was derived from CRC patients recorded in the SEER program during 2010–2020. The external validation cohort was obtained from CRC patients admitted to Zhejiang Cancer Hospital between January 2007 and July 2022. This study was approved by the research ethics committee of Zhejiang Cancer Hospital (grant number 2022KY628), and the informed consent of patients was waived. The inclusion criterion was pathologically diagnosed AJCC stage III CRC with LNM after radical surgery. All methodologies applied in this study were carried out in accordance with the relevant guidelines and regulations. Patients with incomplete clinicopathological or follow-up information were excluded. A total of 66,407 eligible patients were included in the SEER database cohort and 4,062 in the external validation cohort. A detailed flowchart is shown in Supplementary Fig. 1. The clinicopathological dataset included gender, age, race, primary tumor location, infiltration depth, peripheral nerve invasion, tumor deposit (TD) status, number of LNM, number of lymph nodes dissected, preoperative carcinoembryonic antigen (CEA), and neoadjuvant and adjuvant therapy.

Lymph node classification and modified N stage

The number of metastatic lymph nodes was classified according to the AJCC 8th edition TNM classification as N1a (1); N1b (2–3); N2a (4–6); and N2b (greater than 6)29. The LNR was categorized into four classes corresponding to the AJCC N stage based on the cutoff values determined by the R software.
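As a minimal sketch, the two classifications can be written as lookup functions. The AJCC rule follows the counts listed above; the LNR cutoffs (0.11, 0.39, 0.68) are the values reported in the Results of this study. The function and constant names are hypothetical, and stage III patients are assumed to have at least one positive node.

```python
from bisect import bisect_right

# LNR cutoffs reported in the Results section of this study.
LNR_CUTOFFS = (0.11, 0.39, 0.68)
MODIFIED_LABELS = ("mN1a", "mN1b", "mN2a", "mN2b")


def ajcc_n_category(positive: int) -> str:
    """AJCC 8th N category from the number of positive nodes (>= 1)."""
    if positive < 1:
        raise ValueError("stage III with LNM requires at least one positive node")
    if positive == 1:
        return "N1a"
    if positive <= 3:
        return "N1b"
    if positive <= 6:
        return "N2a"
    return "N2b"


def modified_n_category(positive: int, examined: int) -> str:
    """Modified N category from the lymph node ratio.

    bisect_right places an LNR equal to a cutoff in the higher class,
    matching the 'cutoff <= LNR' convention used for mN1b, mN2a and mN2b.
    """
    if not 0 < positive <= examined:
        raise ValueError("require 0 < positive <= examined")
    lnr = positive / examined
    return MODIFIED_LABELS[bisect_right(LNR_CUTOFFS, lnr)]


if __name__ == "__main__":
    # Hypothetical patient: 4 positive nodes among 20 examined.
    print(ajcc_n_category(4), modified_n_category(4, 20))
```

Keeping the cutoffs in a single tuple makes it straightforward to re-derive the classes if the cutoffs are re-estimated on another cohort.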

Follow-up

Follow-up data for the SEER database cohort were obtained from the SEER program, and those for the external validation cohort were collected from the hospital’s follow-up system. Patients in the external validation cohort were followed up every six months for the first five years after surgery and once a year thereafter until death. The follow-up schedule for the SEER database cohort is detailed on the SEER website. The median follow-up time was 72 months (range: 55–139 months) for the SEER database cohort and 68 months (range: 58–144 months) for the external validation cohort.

Statistical analysis

R (version 4.2.1) and SPSS (version 29.0.1.0) were used for statistical analyses. Overall survival (OS) was the primary outcome of this study. The prognostic stratification performance of the different staging models was compared using the time-dependent area under the curve (TAUC), that is, by comparing the AUC values of the models at successive time points. In this study, AUC values were calculated every two months from 0 to 100 months and plotted as curves. Cutoff values were determined using the R package “maxstat”30, which finds the best cutoff value for a continuous variable by means of the maximally selected rank statistic, an outcome-oriented method that yields the cutoff with the most significant association between the continuous variable and patient prognosis. Differences in frequency between groups were assessed with the chi-square test, and differences in prognosis were assessed with Kaplan-Meier survival curves and the log-rank test. Cox regression analysis was used to determine whether a variable was an independent risk factor for prognosis and to compute the hazard ratio (HR) and 95% confidence interval (CI). A P-value < 0.05 was considered statistically significant. Observed confounders were balanced between groups using propensity score matching (PSM)31.
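The idea behind an outcome-oriented cutoff search can be illustrated with a simplified sketch: every candidate cutoff splits the patients into a low and a high group, a two-group log-rank statistic is computed for each split, and the cutoff with the largest statistic is kept. This is not the implementation of the “maxstat” package, which works with standardized statistics and corrects the P-value for the multiple cutoffs tested. The function names, the `min_group` parameter, and the "value > cutoff" grouping convention are assumptions made for illustration.

```python
from typing import Optional, Sequence, Tuple


def logrank_chi2(times: Sequence[float], events: Sequence[int],
                 in_high: Sequence[bool]) -> float:
    """Two-group log-rank chi-square statistic (larger = stronger separation)."""
    obs_minus_exp = 0.0
    variance = 0.0
    for t in sorted({t for t, e in zip(times, events) if e}):
        n = n_high = d = d_high = 0
        for ti, ei, hi in zip(times, events, in_high):
            if ti >= t:                    # still at risk just before time t
                n += 1
                n_high += hi
                if ti == t and ei:         # event observed exactly at time t
                    d += 1
                    d_high += hi
        obs_minus_exp += d_high - d * n_high / n
        if n > 1:
            p = n_high / n
            variance += d * p * (1 - p) * (n - d) / (n - 1)
    return obs_minus_exp ** 2 / variance if variance > 0 else 0.0


def best_cutoff(values: Sequence[float], times: Sequence[float],
                events: Sequence[int],
                min_group: int = 2) -> Tuple[Optional[float], float]:
    """Scan candidate cutoffs; groups are 'value <= cut' vs 'value > cut'."""
    best_cut, best_stat = None, -1.0
    for cut in sorted(set(values))[:-1]:
        in_high = [v > cut for v in values]
        n_high = sum(in_high)
        if min(n_high, len(values) - n_high) < min_group:
            continue                       # skip splits with a tiny group
        stat = logrank_chi2(times, events, in_high)
        if stat > best_stat:
            best_cut, best_stat = cut, stat
    return best_cut, best_stat
```

The scan is quadratic in the number of patients per cutoff, which is acceptable for a sketch but not for a cohort of tens of thousands of patients; a production analysis would rely on the dedicated package.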

Results

As detailed in the flowchart in Supplementary Fig. 1, a total of 66,407 patients diagnosed with stage III CRC were enrolled in the SEER database cohort, and 4,062 patients were included in the external validation cohort. The baseline characteristics of the enrolled patients are listed in Table 1.

Table 1 Baseline characteristics of the study population in SEER database cohort and external validation cohort.

AJCC N stage, which neglects the number of lymph nodes dissected, is defective.

The outcome-oriented log-rank method was used to calculate the optimal cutoff value for the number of lymph nodes examined, which was 14. With 14 as the grouping criterion, Kaplan-Meier survival analysis showed a significant prognostic difference between patients with adequate and inadequate lymph node dissection, as shown in Supplementary Fig. 2 (P < 0.001, 5-year OS: 52.4% vs. 61.2%).
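For readers unfamiliar with how a 5-year OS figure is read off a Kaplan-Meier curve, the following is a minimal sketch of the product-limit estimator, evaluated at 60 months. The function names and the toy data are hypothetical and are not taken from either cohort.

```python
from typing import List, Sequence, Tuple


def kaplan_meier(times: Sequence[float],
                 events: Sequence[int]) -> List[Tuple[float, float]]:
    """Product-limit estimate: (time, survival probability) at each event time.

    events[i] is 1 for death and 0 for censoring at times[i].
    """
    survival = 1.0
    curve = []
    for t in sorted({t for t, e in zip(times, events) if e}):
        at_risk = sum(1 for ti in times if ti >= t)
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e)
        survival *= 1 - deaths / at_risk
        curve.append((t, survival))
    return curve


def survival_at(curve: Sequence[Tuple[float, float]], horizon: float) -> float:
    """Survival probability at `horizon` (the curve is a right-continuous step)."""
    value = 1.0
    for t, s in curve:
        if t > horizon:
            break
        value = s
    return value


if __name__ == "__main__":
    # Hypothetical follow-up in months; 0 marks a censored patient.
    months = [10, 20, 30, 40, 70, 80]
    died = [1, 0, 1, 1, 0, 1]
    print(survival_at(kaplan_meier(months, died), 60))  # 5-year OS estimate
```

Censored patients leave the risk set without lowering the curve, which is what distinguishes this estimate from a simple proportion of survivors.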

In Fig. 1 and Supplementary Fig. 3, patients with fewer than 14 metastatic lymph nodes were divided into 13 groups according to the number of lymph node metastases (1 to 13). Each group was then subdivided according to whether the number of lymph nodes dissected reached 14. In the subsequent Kaplan-Meier survival analysis, a statistically significant difference in survival was identified between patients grouped by the number of dissected lymph nodes, regardless of the number of metastatic lymph nodes. Interestingly, the difference tended to increase as the number of LNM increased (all P < 0.001).

Fig. 1

Kaplan-Meier survival analysis between adequate and inadequate lymph node dissection when the number of positive lymph nodes was 5 (A), 6 (B), 7 (C), 8 (D), 9 (E), 10 (F), 11 (G), 12 (H), and 13 (I), respectively.

The performance of LNR is superior to that of the number of metastatic lymph nodes in prognostic stratification.

First, LNR and LODDS were treated as continuous variables and compared with the number of metastatic lymph nodes. As the TAUC curves in Fig. 2A show, the stratification performance of LNR was similar to that of LODDS and clearly superior to that of the number of LNM. To help determine the number of lymph nodes to be examined, which might guide clinical practice, the same methodology was used to find the optimal cutoff values for LNR. As shown in Fig. 3A–C, three optimal LNR cutoffs were identified, dividing the LNR into four stages matching the AJCC N stages: modified N1a (LNR < 0.11), modified N1b (0.11 ≤ LNR < 0.39), modified N2a (0.39 ≤ LNR < 0.68), and modified N2b (LNR ≥ 0.68). Furthermore, as depicted in Fig. 3D–F, Kaplan-Meier survival analysis of the groups defined by these cutoff values showed a statistically significant difference in prognosis between each pair of groups (LNR < 0.39 vs. LNR ≥ 0.39: P < 0.001, 5-year OS 63.2% vs. 36.5%; 0 < LNR < 0.11 vs. 0.11 ≤ LNR < 0.39: P < 0.001, 5-year OS 68.8% vs. 57.7%; 0.39 ≤ LNR < 0.68 vs. LNR ≥ 0.68: P < 0.001, 5-year OS 42.0% vs. 24.5%). The stratification performance of the modified N stage was then compared with that of the AJCC N stage in Fig. 2B. After the continuous LNR was stratified, its AUC decreased but remained higher overall than that of the AJCC N stage. In addition, patients with an insufficient number of lymph nodes dissected (examined lymph nodes < 14) were selected to compare the two N staging systems separately in Fig. 2C; the results indicated that even with an insufficient number of lymph nodes examined, the modified N stage performed better than the AJCC N stage.
In addition, Table 2 shows that the modified N stage was an independent risk factor in both the univariate and multivariate Cox regression analyses (HR: 1.66, 95% CI: 1.58–1.74). PSM was then used to balance the confounders between groups (Supplementary Table 1). In the univariate and multivariate Cox regression analyses conducted after PSM, the modified N stage remained a significant independent risk factor, as indicated in Supplementary Table 2 (HR: 1.68, 95% CI: 1.58–1.78).

Fig. 2

Performance comparison based on time-dependent area under the curve between different lymph node prognostic stratification models. All patients: LNR vs. LODDS vs. number of positive lymph nodes (A); all patients: modified N stage vs. AJCC N stage (B); patients with fewer than 14 lymph nodes examined: modified N stage vs. AJCC N stage (C).
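A time-dependent AUC curve of the kind compared in Fig. 2 can be sketched as follows, on the two-month grid from 0 to 100 months described in the Methods. This is a deliberately naive estimator and an assumption of this illustration: patients censored before the time point are simply dropped, whereas dedicated software typically corrects for censoring, and the source does not state which estimator was used. The function names are hypothetical, and a higher marker value is assumed to indicate a worse prognosis.

```python
from typing import List, Optional, Sequence, Tuple


def naive_auc_at(markers: Sequence[float], times: Sequence[float],
                 events: Sequence[int], horizon: float) -> Optional[float]:
    """AUC of the marker for 'death by horizon' versus 'alive after horizon'.

    Cases died at or before the horizon; controls were followed beyond it.
    Patients censored before the horizon are excluded (naive simplification).
    Returns None when there are no cases or no controls.
    """
    cases = [m for m, t, e in zip(markers, times, events) if e and t <= horizon]
    controls = [m for m, t in zip(markers, times) if t > horizon]
    if not cases or not controls:
        return None
    score = 0.0
    for case in cases:
        for control in controls:
            if case > control:
                score += 1.0       # concordant pair
            elif case == control:
                score += 0.5       # tie counts as half
    return score / (len(cases) * len(controls))


def tauc_curve(markers: Sequence[float], times: Sequence[float],
               events: Sequence[int], start: int = 0, stop: int = 100,
               step: int = 2) -> List[Tuple[int, Optional[float]]]:
    """AUC evaluated every `step` months from `start` to `stop` inclusive."""
    return [(t, naive_auc_at(markers, times, events, t))
            for t in range(start, stop + 1, step)]
```

Plotting the resulting pairs for LNR, LODDS, and the number of positive nodes on the same axes gives a comparison in the spirit of the figure, subject to the simplification noted above.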

Fig. 3

The first LNR-based prognostic stratification (A) and Kaplan-Meier survival analysis (D) for all patients; The second LNR-based prognostic stratification (B) and Kaplan-Meier survival analysis (E) for the patients with LNR < 0.26; The third LNR-based prognostic stratification (C) and Kaplan-Meier survival analysis (F) for the patients with LNR > 0.26.

Table 2 Univariate and multivariate Cox regression analyses based on the SEER database cohort.

Supplementary Table 3 shows the distribution of patients across the AJCC N stage and the modified N stage. The most significant distributional differences between the two N staging criteria were in N1b (mN1a: 35.3%, mN1b: 62.1%), N2a (mN1b: 80.5%, mN2a: 13.3%), N2b (mN1b: 30.5%, mN2a: 43.8%, mN2b: 25.4%), mN1b (N1b: 46.9%, N2a: 36.3%), and mN2a (N2a: 25.5%, N2b: 67.1%). To further clarify these differences, Kaplan-Meier survival analysis was used to profile the prognosis within each of these subgroups (Fig. 4). The results showed significant survival differences between the modified N stage subgroups within each AJCC N stage (N1b, mN1a vs. mN1b: P < 0.001, 5-year OS 68.6% vs. 58.9%; N2a, mN1b vs. mN2a: P < 0.001, 5-year OS 56.9% vs. 42.0%; N2b, mN1b vs. mN2a vs. mN2b: P < 0.001, 5-year OS 55.8% vs. 42.1% vs. 22.9%). However, within mN2a, where significant distributional differences existed, there was no statistically significant difference in prognosis between the N2a and N2b groups (P = 0.910, 5-year OS 42.0% vs. 42.1%). Furthermore, within mN1b, although a statistically significant difference in prognosis existed between subgroups (P = 0.021, 5-year OS 58.9% vs. 56.9%), it was much smaller than the differences between modified N stage subgroups within the AJCC N stages.

Fig. 4

Kaplan-Meier survival analysis based on differences in patient distribution between AJCC N stage and modified N stage. N1b: mN1a vs. mN1b (A); N2a: mN1b vs. mN2a (B); N2b: mN1b vs. mN2a vs. mN2b (C); mN1b: N1b vs. N2a (D); mN2a: N1b vs. N2a vs. N2b (E).

Further analyses were conducted in the external validation cohort (Supplementary Fig. 4). The findings demonstrated that prognosis differed significantly between each of the modified subgroups (LNR < 0.39 vs. LNR ≥ 0.39: P < 0.001, 5-year OS 73.5% vs. 42.8%; 0 < LNR < 0.11 vs. 0.11 ≤ LNR < 0.39: P < 0.001, 5-year OS 79.5% vs. 67.6%; 0.39 ≤ LNR < 0.68 vs. LNR ≥ 0.68: P = 0.021, 5-year OS 45.0% vs. 34.2%), and the modified N stage was again superior to the AJCC N stage in stratification performance. In the subsequent univariate and multivariate Cox regression analyses (Supplementary Tables 4, 5, and 6), the modified N stage was an independent risk factor both before (HR: 1.94, 95% CI: 1.59–2.36) and after PSM (HR: 1.81, 95% CI: 1.45–2.26).

Discussion

Although the AJCC TNM staging system is the most widely used clinical staging protocol, its accuracy has been questioned, especially in the N staging of metastatic lymph nodes6,7. First, it ignores the effect of the location of positive lymph nodes on prognosis and uses the number of positive lymph nodes as the sole stratification criterion29. However, some studies have indicated that D3 station lymph node metastasis, including the main and lateral lymph nodes, is an independent prognostic factor for CRC patients32. The Japanese Society for Cancer of the Colon and Rectum (JSCCR) staging system also classifies CRC patients with D3 station lymph node metastasis into the new jN3 stage33, so as to draw clinical attention to such patients and ensure appropriate and sufficient treatment, which may help them achieve a better prognosis34. Moreover, the importance of TD for prognostic stratification has not been fully appreciated35: the AJCC N stage assigns TD-positive patients to N1c only when there is no LNM. More importantly, the neglect of the number and extent of lymph node dissection makes its accuracy questionable. At present, the scope of lymph node dissection for CRC is mainly affected by the following factors. First, it is decided according to the specific conditions of the tumor (such as the depth of invasion and the risk of lymph node metastasis); for high-risk patients, such as those with stage T3 to T4 CRC, D3 dissection may be recommended. Second, a surgeon’s lack of familiarity with the anatomy may lead to incomplete dissection or accidental injury to surrounding tissues, resulting in postoperative complications. Third, laparoscopic or robot-assisted surgery may allow more accurate identification and removal of lymph nodes, whereas traditional open surgery may offer limited visibility and flexibility.
Fourth, abdominal adipose tissue in obese patients may obstruct the visual field, which increases the difficulty of lymph node localization and dissection36,37,38. The concepts of LNR and LODDS were therefore proposed. Both take into account the number of positive lymph nodes along with the number of lymph nodes cleared and should in theory stratify prognosis better than the number of positive lymph nodes alone. This view was supported by the comparison of prognostic stratification performance among the different variables in this study. However, the performance of LNR and LODDS was almost identical, and given that LODDS is more difficult to calculate, LNR seems to be the most suitable lymph node metastasis stratification model for clinical application. We then divided the LNR into four levels according to the optimal cutoff values, converting it into a categorical variable (the modified N stage), and compared it with the AJCC N stage. Although performance decreased after categorization, it remained higher than that of the AJCC N stage, which is based on the number of positive lymph nodes. In addition, the superiority of LNR was reinforced by the comparison of distributional discrepancies between the modified N stage and the AJCC N stage. Within modified N stages, where the distribution of AJCC N stages varied widely, Kaplan-Meier survival analysis showed no or only slight differences in survival between subgroups. Within AJCC N stages, by contrast, where the distribution of modified N stages varied considerably, Kaplan-Meier survival analysis showed significant differences in survival between subgroups. A possible reason is that the modified N stage allows patients who were misstaged by the AJCC N stage to be returned to the correct stage.
For example, a patient in whom 7 of the 20 lymph nodes dissected at radical resection are found to be positive on postoperative pathological examination would be classified as N2b according to the AJCC staging system but as mN1b according to the modified staging system. Because the prognosis of patients with adequate lymph node dissection is significantly better than that of patients with inadequate lymph node dissection, such a patient should not be assigned to the later stage, which avoids the side effects of overtreatment.
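The arithmetic behind this example, using the LNR cutoffs reported in the Results (0.11, 0.39, 0.68), is shown in the first line below. The second line is a hypothetical contrast that is not taken from the study: the same 7 positive nodes found among only 10 examined nodes.

```latex
\[
\mathrm{LNR} = \frac{7}{20} = 0.35, \qquad 0.11 \le 0.35 < 0.39 \;\Rightarrow\; \text{mN1b}, \qquad 7 > 6 \;\Rightarrow\; \text{N2b (AJCC)}
\]
\[
\mathrm{LNR} = \frac{7}{10} = 0.70, \qquad 0.70 \ge 0.68 \;\Rightarrow\; \text{mN2b}, \qquad 7 > 6 \;\Rightarrow\; \text{N2b (AJCC)}
\]
```

Both patients share the same AJCC category, while the modified stage separates them according to how many nodes were examined.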

Regarding the possible deficiencies of the current AJCC N staging, the findings of this study and of previous studies suggest that the following measures may improve its staging performance, allow more accurate prognostic stratification of CRC patients, and support individualized treatment strategies that maximize the chance of an optimal prognosis. First, use the LNR as a core indicator for N staging, replacing the current standard that relies only on the number of positive lymph nodes. Second, improve the classification of D3 station metastasis (similar to the jN3 staging of the JSCCR) to reflect its prognostic significance. Third, stratify TD-positive patients separately instead of classifying them only as N1c (the current AJCC criterion). Fourth, combine modified N staging with T staging to redefine the criteria for substages IIIA, IIIB, and IIIC and avoid excessive heterogeneity within AJCC stage III.

Several studies have suggested that LNR performance is limited when lymph node clearance is inadequate39. In this study, when fewer than 14 lymph nodes were dissected, the performance of the modified N stage was reduced but remained higher than that of the AJCC N stage. In addition, it should not be ignored that the number of lymph nodes examined has increased with the development of surgical instruments and the improvement of surgeons’ skills. Studies from several years ago reported that the median number of lymph nodes removed was between 6 and 13 (refs. 22,38,40). Another limitation of the LNR is that patients with the same LNR value may be considerably heterogeneous when all detected lymph nodes are metastatic41; for example, prognosis may differ between a patient in whom 2 of 2 examined lymph nodes are positive and a patient in whom all 15 examined lymph nodes are positive. However, in this study, the median number of lymph nodes dissected was 18 in both the SEER database cohort and the external validation cohort, so the adverse effect of insufficient lymph node dissection on LNR performance is steadily diminishing. The optimal cutoff value for the number of lymph nodes dissected in the present study differs from that in previous studies, presumably for two reasons. First, the data included in this study are relatively recent, and advances in surgical techniques and pathological testing methods have resulted in more lymph nodes being detected than in previous studies and therefore in a higher cutoff value. Second, in terms of statistical methods, this study used the log-rank-oriented “maxstat” method to determine the cutoff, which focuses more on prognostic relevance than the traditional quartiles or means.

This study has some limitations. First, as a retrospective study, it may be subject to selection bias. Second, the external validation cohort came from a single medical center, and multicenter or international cohort validation is lacking. Third, this study discussed only the direction of and recommendations for staging improvement; further studies are needed to confirm the details and performance of the new modified stage. In the future, the generalizability of LNR staging still needs to be validated in larger samples, and machine learning could be used to construct a comprehensive prognostic model including LNR, LODDS, TD, and lymph node location.

In conclusion, LNR is an excellent stratification model for metastatic lymph nodes. It performs better than the number of positive lymph nodes alone and performs similarly to LODDS while being easier to calculate, and it therefore has greater potential for clinical application.