Table 9 Discretization of continuous features.

From: Machine Learning (ML) based-method applied in recurrent pregnancy loss (RPL) patients diagnostic work-up: a potential innovation in common clinical practice

| Feature | Risk class 0 | Risk class 1 | Risk class 2 | Risk class 3 | Risk class 4 | NaN for each feature |
| --- | --- | --- | --- | --- | --- | --- |
| Age (years) | 18–27 | 28–32 | 33–37 | 38–42 | >42 | 14 |
| APCR (%) | 2.4–3.5 | 1.84–2.39 & 3.51–4.05 | 1.28–1.83 & 4.06–4.61 | 0.16–1.27 & 4.62–5.73 | <0.16 & ≥5.74 | 393 |
| Protein C (%) | 70–130 | 55–69.99 & 130.01–144.99 | 40–54.99 & 145–159.99 | 25–39.99 & 160–175 | <25 & ≥175 | 271 |
| Protein S (%) | 53–109 | 38–52.99 & 109.01–123.99 | 23–37.99 & 124–138.99 | 8–22.99 & 139–154 | <8 & ≥154 | 263 |
| AT III (%) | 80–120 | 65–79.99 & 120.01–134.99 | 50–64.99 & 135–149.99 | 35–49.99 & 150–165 | <35 & ≥165 | 364 |
| Homocysteinemia (µM) | 5–12 | 4–4.99 & 12.01–14.99 | 3–3.99 & 15–17.99 | 2–2.99 & 18–21 | <2 & ≥21 | 272 |
| TSH (mU/ml) | 0.5–3.8 | 0.4–0.49 & 3.81–4.79 | 0.3–0.39 & 4.8–7.79 | 0.1–0.29 & 7.8–17.99 | <0.1 & ≥18 | 200 |
| BMI (kg/m²) | 18.5–24.99 | 16.5–18.49 | <16.5 | 25–30 | >30 | 143 |

  1. Continuous features were discretized into 5 risk classes, ranging from normal (risk class 0) to high risk (risk class 4). The last column shows the number of NaN (Not a Number) values for each feature in the dataset. NaN values were set to 0 because they correspond to cases in which the physician did not consider the exam necessary for that patient, so the feature was taken to be in the normal range (see Materials and Methods, Discretization of continuous features).
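
The discretization rule described in the note can be sketched in a few lines of Python. This is only an illustration, not the authors' implementation: the function name `risk_class` and the constants `PROTEIN_C_RANGES` and `AT_III_RANGES` are hypothetical. The table also leaves some boundaries open (for example, 175 appears in both class 3 and class 4 for Protein C, and values between 130 and 130.01 are unassigned), so the sketch assumes that class 0 is a closed interval and that every wider band includes its lower bound and excludes its upper bound.

```python
import math

# Nested bands for risk classes 0..3, read from Table 9.
# Each band is the full span covered by that class on both sides of normal;
# values outside the widest band fall into class 4.
PROTEIN_C_RANGES = [(70, 130), (55, 145), (40, 160), (25, 175)]
AT_III_RANGES = [(80, 120), (65, 135), (50, 150), (35, 165)]


def risk_class(value, nested_ranges):
    """Map a continuous measurement to a risk class from 0 to 4.

    Missing values (None or NaN) are mapped to class 0, following the
    table note: the exam was not requested, so the feature is treated
    as normal.
    """
    if value is None or math.isnan(value):
        return 0

    # Class 0: closed interval (assumption: both limits count as normal).
    low, high = nested_ranges[0]
    if low <= value <= high:
        return 0

    # Classes 1..3: the first (narrowest) band containing the value wins.
    # Assumption: lower bound inclusive, upper bound exclusive.
    for risk, (low, high) in enumerate(nested_ranges[1:], start=1):
        if low <= value < high:
            return risk

    # Outside every band: highest risk class.
    return len(nested_ranges)


if __name__ == "__main__":
    for sample in (100, 60, 150, 30, 180, float("nan")):
        print(sample, risk_class(sample, PROTEIN_C_RANGES))
```

One-sided features such as Age, or BMI (whose classes are not symmetric around the normal range), would need their own lookup; the nested-band shortcut only fits the two-sided laboratory features.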