Table 1 Baseline characteristics of the enrolled individuals in the derivation and temporal validation cohorts

From: Development and validation of machine learning models for young-onset colorectal cancer risk stratification

| Characteristics | All patients (n = 10,874) | Derivation cohort (n = 8820) | Temporal validation cohort (n = 2054) |
| --- | --- | --- | --- |
| YOCRC | 477 (4.4%) | 399 (4.5%) | 78 (3.8%) |
| Sociodemographic | | | |
| Age (y) | 43 (36–48) | 43 (36–48) | 43 (35–47) |
| Male, n (%) | 6076 (55.9%) | 4976 (56.4%) | 1100 (53.6%) |
| Height (m) | 1.65 (1.60–1.72) | 1.65 (1.60–1.72) | 1.65 (1.59–1.72) |
| Weight (kg) | 63 (55–72) | 62.5 (55–72) | 64 (55–74) |
| BMI (kg/m²) | 22.49 (20.20–25.06) | 22.49 (20.20–25.04) | 22.51 (20.31–25.06) |
| Smoking, n (%) | 705 (6.5%) | 476 (5.4%) | 229 (11.1%) |
| Alcohol intake, n (%) | 534 (4.9%) | 357 (4.0%) | 177 (8.6%) |
| Family history of CRC, n (%) | 100 (0.9%) | 70 (0.8%) | 30 (1.5%) |
| Comorbidity | | | |
| DM, n (%) | 205 (1.9%) | 178 (2.0%) | 27 (1.3%) |
| Hyperlipidemia, n (%) | 194 (1.8%) | 156 (1.8%) | 38 (1.9%) |
| Hypertension, n (%) | 580 (5.3%) | 471 (5.3%) | 109 (5.3%) |
| Cardiac failure, n (%) | 54 (0.5%) | 40 (0.5%) | 14 (0.7%) |
| CKD, n (%) | 18 (0.2%) | 16 (0.2%) | 2 (0.1%) |
| History of stroke, n (%) | 34 (0.3%) | 30 (0.3%) | 4 (0.2%) |
| Medications | | | |
| Aspirin, n (%) | 46 (0.4%) | 35 (0.4%) | 11 (0.5%) |
| Statin, n (%) | 34 (0.3%) | 25 (0.3%) | 9 (0.4%) |
| Symptoms | | | |
| Hematochezia, n (%) | 925 (8.5%) | 778 (8.8%) | 147 (7.2%) |
| Altered bowel habit, n (%) | 958 (8.8%) | 764 (8.7%) | 194 (9.4%) |
| Diarrhoea, n (%) | 658 (6.1%) | 526 (6.0%) | 132 (6.4%) |
| Constipation, n (%) | 161 (1.5%) | 131 (1.5%) | 30 (1.5%) |
| Abdominal pain, n (%) | 1360 (12.5%) | 1108 (12.6%) | 252 (12.3%) |
| Abdominal distension, n (%) | 928 (8.5%) | 710 (8.0%) | 218 (10.6%) |
| Abdominal mass, n (%) | 14 (0.1%) | 13 (0.2%) | 1 (0.05%) |
| Anorexia, n (%) | 9 (0.1%) | 7 (0.1%) | 2 (0.1%) |
| Unexplained weight loss, n (%) | 8 (0.1%) | 7 (0.1%) | 1 (0.05%) |
| Laboratory data | | | |
| WBC, 10⁹/L | 5.80 (4.69–7.10) | 5.79 (4.68–7.11) | 5.84 (4.74–7.10) |
| Neutrophils, 10⁹/L | 3.37 (2.58–4.39) | 3.36 (2.56–4.39) | 3.43 (2.63–4.35) |
| Lymphocytes, 10⁹/L | 1.66 (1.27–2.08) | 1.66 (1.26–2.07) | 1.69 (1.31–2.10) |
| Monocytes, 10⁹/L | 0.44 (0.34–0.56) | 0.43 (0.34–0.56) | 0.45 (0.35–0.56) |
| Eosinophils, 10⁹/L | 0.09 (0.05–0.16) | 0.09 (0.05–0.16) | 0.10 (0.05–0.17) |
| PLT, 10⁹/L | 226 (185–273) | 225 (185–272) | 228 (187.75–276) |
| Hemoglobin, g/L | 135 (120–150) | 135 (120–149) | 136 (122–150) |
| RDW | 0.13 (0.12–0.14) | 0.13 (0.12–0.14) | 0.13 (0.12–0.14) |
| MCV, fL | 89.80 (86.60–92.80) | 89.80 (86.60–92.80) | 89.70 (86.70–92.90) |
| CRP, mg/L | 0.54 (0.50–5.00) | 0.64 (0.50–5.00) | 0.50 (0.50–3.09) |
| ALT, U/L | 19 (13–30) | 19 (13–30) | 19 (13–30) |
| AST, U/L | 21 (17–27) | 21 (17–27) | 21 (17–26) |
| GGT, mmol/L | 20 (12–35) | 20 (12–36) | 20 (12–34.75) |
| TBil, mmol/L | 11.52 (8.57–15.64) | 11.50 (8.56–15.60) | 11.60 (8.61–15.82) |
| DBil, mmol/L | 3.40 (2.50–4.70) | 3.40 (2.50–4.60) | 3.40 (2.50–4.80) |
| TPO, g/L | 69.20 (65.30–73) | 69.30 (65.40–73.10) | 68.80 (65–72.60) |
| Albumin, g/L | 43.80 (41.12–46.05) | 43.80 (41.10–46.00) | 43.80 (41.21–46.10) |
| A/G | 1.71 (1.54–1.90) | 1.71 (1.54–1.89) | 1.74 (1.56–1.93) |
| ALP, mmol/L | 67 (55–83) | 67 (55–83) | 66.60 (54.58–82.60) |
| LDH | 179 (159–207) | 179 (159–208) | 179 (158–206) |
| TG, mmol/L | 1.34 (0.91–2.10) | 1.35 (0.92–2.09) | 1.34 (0.90–2.15) |
| Total cholesterol, mmol/L | 4.28 (3.72–4.92) | 4.29 (3.72–4.92) | 4.25 (3.71–4.90) |
| HDL, mmol/L | 1.12 (0.93–1.36) | 1.12 (0.94–1.36) | 1.12 (0.93–1.35) |
| LDL, mmol/L | 2.36 (1.91–2.90) | 2.37 (1.90–2.90) | 2.36 (1.94–2.92) |
| ApoA1, g/L | 1.29 (1.15–1.45) | 1.29 (1.15–1.45) | 1.29 (1.16–1.46) |
| ApoB, g/L | 0.79 (0.66–0.93) | 0.79 (0.66–0.93) | 0.78 (0.66–0.93) |
| Lipoprotein (a), g/L | 102 (48–230) | 103.10 (48.20–238) | 106 (51–234.75) |
| Blood glucose, mmol/L | 4.95 (4.54–5.52) | 4.95 (4.56–5.51) | 4.94 (4.48–5.54) |
| CEA, ng/mL | 1.09 (0.50–2.27) | 1.10 (0.50–2.31) | 1.04 (0.50–2.11) |
| Positive FIT, n (%) | 668 (15.4%) | 560 (15.4%) | 108 (15.6%) |

  1. Note: Features with too many missing values, such as CA-199 and CA-724, are not shown in the table above. The FIT statistics were calculated from the data before missing-data imputation, with missing FIT values excluded from the calculation.
  2. BMI body mass index, CRC colorectal cancer, DM diabetes mellitus, CKD chronic kidney disease, WBC white blood cells, PLT platelets, CRP C-reactive protein, RDW red cell distribution width, MCV mean corpuscular volume, ALT alanine aminotransferase, AST aspartate aminotransferase, GGT gamma glutamyl transpeptidase, TBil total bilirubin, DBil direct bilirubin, TPO total protein, A/G albumin/globulin ratio, ALP alkaline phosphatase, LDH lactate dehydrogenase, TG triglycerides, HDL high-density lipoprotein, LDL low-density lipoprotein, ApoA1 apolipoprotein A1, ApoB apolipoprotein B, CEA carcinoembryonic antigen, FIT fecal immunochemical test.
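
The note on FIT describes a complete-case calculation: the percentage is taken over patients with a recorded FIT result rather than over the whole cohort. The following is a minimal Python sketch of that idea, not the authors' code. The function name `count_and_percent`, the use of `None` to mark a missing value, and the toy `fit_results` list are all assumptions made for illustration.

```python
from typing import Optional, Sequence


def count_and_percent(values: Sequence[Optional[bool]], digits: int = 1) -> str:
    """Format a binary feature as 'n (p%)', skipping missing entries (None).

    Hypothetical helper: the denominator is the number of observed values,
    not the total number of patients.
    """
    observed = [v for v in values if v is not None]
    if not observed:
        return "NA"  # nothing recorded, so no percentage can be reported
    positives = sum(1 for v in observed if v)
    pct = 100.0 * positives / len(observed)
    return f"{positives} ({pct:.{digits}f}%)"


# Toy data (not from the study): 10 patients, 3 without a FIT result.
fit_results = [True, False, None, None, False, True, False, None, False, False]
print(count_and_percent(fit_results))  # 2 positives out of 7 observed results

# A fully observed feature reproduces a table cell from its count and cohort
# size, here the YOCRC row for all patients (477 of 10,874).
yocrc = [True] * 477 + [False] * (10874 - 477)
print(count_and_percent(yocrc))
```

Read this way, the FIT row is consistent with the note: 668 positives at 15.4% implies roughly 4,300 patients with an observed FIT result, well below the 10,874 enrolled, so the FIT percentages are not comparable to rows whose denominator is the full cohort.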