Table 2 Entire dataset, training sets, and validation sets for the two waves that occurred during the Brazilian COVID-19 outbreak.

From: Prediction of SARS-CoV-2-positivity from million-scale complete blood counts using machine learning

 

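Cells show the percentage with the count in parentheses. Column groups: CBC (+) covers COVID-19 (+); CBC (−) covers COVID-19 (−), Influenza-A (+), Influenza-B (+), Influenza-H1N1 (+), and Other viruses (+).

| Dataset | Gender | COVID-19 (+) | COVID-19 (−) | Influenza-A (+) | Influenza-B (+) | Influenza-H1N1 (+) | Other viruses (+) |
|---|---|---|---|---|---|---|---|
| Entire data | Male | 11.3% (122,793) | 34.0% (369,787) | 46.7% (3160) | 46.5% (1384) | 48.4% (4108) | 59.5% (20,107) |
| Entire data | Female | 10.3% (111,673) | 44.4% (482,453) | 53.3% (3604) | 53.5% (1588) | 51.6% (4380) | 40.5% (13,691) |
| Training set: first wave data | Male | 12.9% (5859) | 9.8% (4469) | 4.2% (1895) | 2.1% (975) | 6.0% (2742) | 12.8% (5825) |
| Training set: first wave data | Female | 12.1% (5527) | 15.2% (6918) | 4.9% (2223) | 2.8% (1214) | 6.9% (3118) | 10.3% (4656) |
| Validation set: first wave data | Male | 4.9% (5808) | 37.6% (44,637) | 1.0% (1113) | <0.1% (188) | 1.4% (1660) | 2.3% (2710) |
| Validation set: first wave data | Female | 4.7% (5647) | 43.4% (51,550) | 1.1% (1343) | <0.1% (134) | 1.6% (1842) | 1.7% (2028) |
| Training set: second wave data | Male | 25.9% (24,104) | 10.5% (9770) | 2.0% (1895) | 1.0% (975) | 3.1% (2742) | 6.3% (5825) |
| Training set: second wave data | Female | 24.2% (22,404) | 15.1% (14,088) | 2.3% (2223) | 1.3% (1214) | 3.3% (3118) | 5.0% (4656) |
| Validation set: second wave data | Male | 4.5% (11,860) | 38.9% (101,655) | 0.4% (1113) | <0.1% (188) | 0.6% (1660) | 1.0% (2710) |
| Validation set: second wave data | Female | 4.3% (11,021) | 48.1% (125,776) | 0.5% (1343) | <0.1% (134) | 0.7% (1842) | 0.8% (2028) |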

  1. Training sets were obtained by applying the inclusion–exclusion criteria to the entire dataset and then downsampling the COVID-19 (−) class in the training sets to account for class imbalance. We took October 1st as the split point between the first- and second-wave data to eliminate possible incubation periods before the start of the second wave in early November. Accordingly, validation for the first wave covers data from late June to late September, and validation for the second wave covers early October to late February. N = 1,138,728 CBCs.
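
The two preprocessing steps described in the note, a date-based split between waves and downsampling of the COVID-19 (−) class in the training data, can be illustrated with a minimal Python sketch. It is not the authors' pipeline. The record layout, the function names `split_by_wave` and `downsample_negatives`, the `neg_per_pos` ratio, the random seed, and the year used in the toy dates are all assumptions made for illustration; the note states only "October 1st" and does not give a target class ratio.

```python
import random
from datetime import date

# Hypothetical record layout: (collection_date, label),
# where label 1 = COVID-19 (+) and label 0 = COVID-19 (-).

def split_by_wave(records, split_date):
    """Return (first_wave, second_wave).

    Records dated on or after split_date are assigned to the second wave.
    """
    first = [r for r in records if r[0] < split_date]
    second = [r for r in records if r[0] >= split_date]
    return first, second

def downsample_negatives(records, neg_per_pos=1.0, seed=0):
    """Keep all positives and a random subset of negatives.

    At most neg_per_pos negatives are kept per positive. The ratio is an
    assumed parameter, not a value taken from the source.
    """
    rng = random.Random(seed)
    pos = [r for r in records if r[1] == 1]
    neg = [r for r in records if r[1] == 0]
    keep = min(len(neg), int(round(neg_per_pos * len(pos))))
    return pos + rng.sample(neg, keep)

# Toy data; the year 2020 is an assumption for illustration only.
toy = (
    [(date(2020, 9, 15), 1)] * 2
    + [(date(2020, 9, 15), 0)] * 10
    + [(date(2020, 10, 1), 1)] * 3
    + [(date(2020, 11, 1), 0)] * 9
)

first_wave, second_wave = split_by_wave(toy, date(2020, 10, 1))
train_first = downsample_negatives(first_wave, neg_per_pos=1.0)
print(len(first_wave), len(second_wave), len(train_first))
```

In this sketch only the training portion is downsampled. That is consistent with the table, where the validation sets remain dominated by COVID-19 (−) CBCs while the training sets are much closer to balanced.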