Table 9 Discretization of continuous features.

Risk Classes	0	1	2	3	4	NaN for each feature
Age (years)	18–27	28–32	33–37	38–42	>42	14
APCR (%)	2.4–3.5	1.84–2.39 & 3.51–4.05	1.28–1.83 & 4.06–4.61	0.16–1.27 & 4.62–5.73	<0.16 & ≥ 5.74	393
Protein C (%)	70–130	55–69.99 & 130.01–144.99	40–54.99 & 145–159.99	25–39.99 & 160–175	<25 & ≥ 175	271
Protein S (%)	53–109	38–52.99 & 109.01–123.99	23–37.99 & 124–138.99	8–22.99 & 139–154	<8 & ≥ 154	263
AT III (%)	80–120	65–79.99 & 120.01–134.99	50–64.99 & 135–149.99	35–49.99 & 150–165	<35 & ≥ 165	364
Homocysteinemia (µM)	5–12	4–4.99 & 12.01–14.99	3–3.99 & 15–17.99	2–2.99 & 18–21	<2 & ≥ 21	272
TSH (mU/ml)	0.5–3.8	0.4–0.49 & 3.81–4.79	0.3–0.39 & 4.8–7.79	0.1–0.29 & 7.8–17.99	<0.1 & ≥ 18	200
BMI (kg/m²)	18.5–24.99	16.5–18.49	<16.5	25–30	>30	143

Continuous features were discretized into five risk classes, ranging from normal (risk class 0) to high risk (risk class 4). The last column shows the number of missing values (NaN, Not a Number) for each feature in the dataset. NaN values were set to 0 because they correspond to cases in which the doctor judged the exam unnecessary for that patient, so the feature was taken to be in the normal range (see Materials and Methods, Discretization of continuous features).
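As an illustration only, the mapping in Table 9 can be sketched in Python for two of the features, BMI and Protein C. The function names are hypothetical and this is not the authors' implementation. Two assumptions are made where the table is silent: values falling between the listed bounds (for example 69.995) are handled with half-open intervals, and the shared boundary value 175 for Protein C is assigned to class 4.

```python
import math


def bmi_risk_class(bmi):
    """Map a BMI value (kg/m^2) to a risk class 0-4 following Table 9."""
    # Missing values (None or NaN) are set to class 0, as stated in the caption.
    if bmi is None or math.isnan(bmi):
        return 0
    if bmi < 16.5:
        return 2  # <16.5
    if bmi < 18.5:
        return 1  # 16.5-18.49
    if bmi < 25:
        return 0  # 18.5-24.99 (normal)
    if bmi <= 30:
        return 3  # 25-30
    return 4      # >30


def protein_c_risk_class(value):
    """Map a Protein C value (%) to a risk class 0-4 following Table 9."""
    # Missing values (None or NaN) are set to class 0, as stated in the caption.
    if value is None or math.isnan(value):
        return 0
    if 70 <= value <= 130:
        return 0  # normal range
    if 55 <= value < 70 or 130 < value < 145:
        return 1
    if 40 <= value < 55 or 145 <= value < 160:
        return 2
    # Assumption: 175 itself goes to class 4, since the table lists it in both classes.
    if 25 <= value < 40 or 160 <= value < 175:
        return 3
    return 4      # <25 or >=175


if __name__ == "__main__":
    for bmi in (15.0, 17.0, 22.0, 27.0, 35.0, float("nan")):
        print("BMI", bmi, "->", bmi_risk_class(bmi))
    for pc in (10.0, 30.0, 50.0, 60.0, 100.0, 140.0, 150.0, 170.0, 180.0):
        print("Protein C", pc, "->", protein_c_risk_class(pc))
```

The other symmetric features (APCR, Protein S, AT III, homocysteinemia, TSH) would follow the same pattern as `protein_c_risk_class` with their own thresholds from the table.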
