Table 1 Detailed breakdown of the full dataset, comprising “SEED” and “EXT”, by ground truth class and dataset characteristics (study, device, and geography), showing both the number of images and the relative percentage.
From: Generalizable deep neural networks for image quality classification of cervical images
Dataset characteristics | | | Ground truth categories, no. (%) | | | | | | Grand total by dataset characteristics, no. (%) | |
---|---|---|---|---|---|---|---|---|---|---|
Study | Device | Geography | Low Quality (n = 5902), # images | (%) | Intermediate (n = 10,553), # images | (%) | High Quality (n = 24,704), # images | (%) | Total (n = 41,159), # images | (%) |
“SEED” dataset (Model Development and Internal Validation) | ||||||||||
NHS | Cervigram | Costa Rica | 508 | (8.8%) | 0 | (0.0%) | 9901 | (40.4%) | 10,409 | (25.7%) |
ALTS | Cervigram | USA | 418 | (7.2%) | 0 | (0.0%) | 4027 | (16.4%) | 4445 | (11.0%) |
CVT | DSLR | Costa Rica | 1160 | (20.0%) | 1391 | (13.6%) | 2104 | (8.6%) | 4655 | (11.5%) |
Biop | DSLR | USA | 328 | (5.7%) | 826 | (8.1%) | 524 | (2.1%) | 1678 | (4.1%) |
D Biop | DSLR | Europe | 0 | (0.0%) | 423 | (4.1%) | 749 | (3.1%) | 1172 | (2.9%) |
Itoju | DSLR | Nigeria | 548 | (9.5%) | 1139 | (11.1%) | 3835 | (15.7%) | 5522 | (13.6%) |
Itoju | J5 | Nigeria | 678 | (11.7%) | 1469 | (14.3%) | 2427 | (9.9%) | 4574 | (11.3%) |
Itoju | S8 | Nigeria | 2150 | (37.1%) | 4996 | (48.8%) | 933 | (3.8%) | 8079 | (19.9%) |
Total | | | 5790 | (100.0%) | 10,244 | (100.0%) | 24,500 | (100.0%) | 40,534 | (100.0%) |
“EXT” dataset (External Validation) | ||||||||||
PAVE | IRIS | Cambodia | 116 | (45.8%) | 224 | (35.4%) | 165 | (36.3%) | 505 | (37.7%) |
PAVE | IRIS | DR | 137 | (54.2%) | 409 | (64.6%) | 289 | (63.7%) | 835 | (62.3%) |
Total | | | 253 | (100.0%) | 633 | (100.0%) | 454 | (100.0%) | 1340 | (100.0%) |
Grand total by ground truth | ||||||||||
no. (%) | | | 6043 | (14.4%) | 10,877 | (26.0%) | 24,954 | (59.6%) | 41,874 | (100.0%) |
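
The percentages in the body of the table appear to be column shares within each dataset: each count divided by the corresponding "Total" row of its dataset. The following is a minimal sketch of that arithmetic using the “EXT” rows; the names `CLASSES`, `EXT_COUNTS`, `with_row_totals`, and `column_percentages` are illustrative and do not come from the article.

```python
# Ground truth classes used as column keys (illustrative names, not from the article).
CLASSES = ("low", "intermediate", "high")

# Image counts copied from the "EXT" rows of Table 1.
EXT_COUNTS = {
    "PAVE / IRIS / Cambodia": {"low": 116, "intermediate": 224, "high": 165},
    "PAVE / IRIS / DR": {"low": 137, "intermediate": 409, "high": 289},
}


def with_row_totals(rows):
    """Add a 'total' entry to each row: the sum of its three class counts."""
    return {
        name: {**counts, "total": sum(counts[c] for c in CLASSES)}
        for name, counts in rows.items()
    }


def column_percentages(rows):
    """Express every count as a percentage of its column sum, rounded to 0.1."""
    full = with_row_totals(rows)
    columns = CLASSES + ("total",)
    col_sums = {c: sum(r[c] for r in full.values()) for c in columns}
    return {
        name: {c: round(100 * r[c] / col_sums[c], 1) for c in columns}
        for name, r in full.items()
    }


if __name__ == "__main__":
    for name, pct in column_percentages(EXT_COUNTS).items():
        print(name, pct)
```

Applying the same function to the eight “SEED” rows should reproduce the percentages in that block, assuming the same column-share convention holds there.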