Table 3. Results of the comparison between local and FFL-based training for 5 different datasets.
From: Collaborative training of medical artificial intelligence models with non-uniform labels
| Dataset name | Training set size | Included labels | Training setup | AUROC | P-value |
|---|---|---|---|---|---|
| VinDr-CXR | n = 15,000 | No finding, aortic enlargement, pleural thickening, cardiomegaly, pleural effusion | Local | 0.867 ± 0.045 | 0.001 |
| | | | FFL | 0.885 ± 0.049 | |
| ChestX-ray14 | n = 83,525 | Cardiomegaly, lung opacity, lung lesion, pneumonia, edema | Local | 0.744 ± 0.076 | 0.363 |
| | | | FFL | 0.744 ± 0.080 | |
| CheXpert | n = 126,141 | Cardiomegaly, lung opacity, lung lesion, pneumonia, edema | Local | 0.796 ± 0.064 | 0.243 |
| | | | FFL | 0.797 ± 0.061 | |
| MIMIC-CXR-JPG-v2.0 | n = 237,972 | Enlarged cardiomediastinum, consolidation, pleural effusion, pneumothorax, atelectasis | Local | 0.772 ± 0.072 | 0.004 |
| | | | FFL | 0.786 ± 0.066 | |
| UKA-CXR | n = 122,297 | Pleural effusion left, pleural effusion right, cardiomegaly, pneumonic infiltrates left, pneumonic infiltrates right | Local | 0.916 ± 0.031 | 0.001 |
| | | | FFL | 0.918 ± 0.031 | |