Table 1 Comparison of the general features of the training and test sets.
From: Application of machine learning algorithm in predicting distant metastasis of T1 gastric cancer
| Variable | Training set, n (%) (N = 1889) | Test set, n (%) (N = 809) | P value | Validation set, n (N = 107) |
|---|---|---|---|---|
| Age (years) | | | 0.7953 | |
| < 40 | 59 (3.12%) | 25 (3.09%) | | 8 |
| 40–60 | 405 (21.44%) | 178 (22.00%) | | 67 |
| 60–80 | 971 (51.41%) | 426 (52.66%) | | 31 |
| > 80 | 454 (24.03%) | 180 (22.25%) | | 1 |
| Sex | | | 0.2379 | |
| Male | 1050 (55.58%) | 429 (53.03%) | | 71 |
| Female | 839 (44.42%) | 380 (46.97%) | | 36 |
| T stage | | | 0.3458 | |
| T1a | 1054 (55.80%) | 468 (57.85%) | | 40 |
| T1b | 835 (44.20%) | 341 (42.15%) | | 67 |
| N stage | | | 0.3622 | |
| N0 | 1553 (82.21%) | 671 (82.94%) | | 81 |
| N1 | 231 (12.23%) | 96 (11.87%) | | 21 |
| N2 | 80 (4.24%) | 26 (3.21%) | | 4 |
| N3 | 25 (1.32%) | 16 (1.98%) | | 1 |
| M stage | | | 1 | |
| M0 | 1669 (88.35%) | 715 (88.38%) | | 93 |
| M1 | 220 (11.65%) | 94 (11.62%) | | 14 |
| Tumor size (cm) | | | 0.6843 | |
| < 2 | 653 (34.57%) | 276 (34.12%) | | 37 |
| 2–5 | 589 (31.18%) | 261 (32.26%) | | 64 |
| > 5 | 132 (6.99%) | 47 (5.81%) | | 6 |
| NA | 515 (27.26%) | 225 (27.81%) | | 0 |
| Differentiation | | | 0.081 | |
| Well | 233 (12.33%) | 80 (9.89%) | | 1 |
| Moderate | 523 (27.69%) | 230 (28.43%) | | 37 |
| Poorly | 853 (45.16%) | 356 (44.00%) | | 63 |
| Undifferentiated | 28 (1.48%) | 21 (2.60%) | | 6 |
| NA | 252 (13.34%) | 122 (15.08%) | | 0 |
| Primary site | | | 0.9185 | |
| Fundus | 92 (4.78%) | 35 (4.33%) | | 12 |
| Body | 303 (16.04%) | 130 (16.07%) | | 29 |
| Antrum | 713 (37.74%) | 306 (37.82%) | | 37 |
| Pylorus | 63 (3.34%) | 35 (4.33%) | | 4 |
| Lesser curve | 233 (12.33%) | 101 (12.48%) | | 12 |
| Greater curve | 99 (5.24%) | 46 (5.69%) | | 1 |
| Overlapping | 135 (7.15%) | 52 (6.43%) | | 12 |
| NOS | 251 (13.29%) | 104 (12.86%) | | 0 |
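
The table does not say which test produced the P values. As a minimal sketch, assuming a Pearson chi-square test with Yates' continuity correction for the two-level variables (an assumption, not something the table states), the training-versus-test comparison for Sex and M stage could be checked from the counts alone. The function name `yates_chi2_2x2` is hypothetical, and only the Python standard library is used.

```python
import math


def yates_chi2_2x2(a, b, c, d):
    """Chi-square test with Yates' correction for the 2x2 table [[a, b], [c, d]].

    Returns (chi2, p_value). Assumed layout: rows are the two levels of the
    variable, columns are training set and test set.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    # Yates-corrected statistic, with the corrected difference clamped at zero.
    corrected = max(abs(a * d - b * c) - n / 2.0, 0.0)
    chi2 = n * corrected ** 2 / (row1 * row2 * col1 * col2)
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Sex: Male 1050 vs 429, Female 839 vs 380 (training vs test counts from Table 1).
chi2_sex, p_sex = yates_chi2_2x2(1050, 429, 839, 380)

# M stage: M0 1669 vs 715, M1 220 vs 94.
chi2_m, p_m = yates_chi2_2x2(1669, 715, 220, 94)

print(f"Sex:     chi2 = {chi2_sex:.4f}, p = {p_sex:.4f}")
print(f"M stage: chi2 = {chi2_m:.4f}, p = {p_m:.4f}")
```

Under these assumptions the results come out close to the 0.2379 and 1 reported for Sex and M stage. Variables with more than two levels would need a chi-square distribution with more degrees of freedom, which this sketch does not cover.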