Table 4 Summary of the IDH mutation status classification performance of different models trained on real or synthetic MRI data
From: Categorical and phenotypic image synthetic learning as an alternative to federated learning
| Experiments | Training data | Metrics | UCSF | NYU | UWM | UPenn | UTSWp2 | Overall | McNemar Test |
|---|---|---|---|---|---|---|---|---|---|
| Centralized Training | Real TCGA + Real UTSW + Real EGD | ACC | 96.2 | 94.5 | 95.9 | 97.9 | 94.5 | 96.2 | - |
| | | SEN (MT) | 88.4 | 87.5 | 77.8 | 72.7 | 88.0 | 86.5 | |
| | | SPE (WT) | 98.2 | 97.0 | 97.5 | 98.5 | 97.4 | 98.0 | |
| | | AUC | 0.980 | 0.966 | 0.969 | 0.978 | 0.992 | 0.979 | |
| FL using the FedAvg algorithm with 100 FL rounds | Real TCGA + Real UTSW + Real EGD | ACC | 95.6 | 92.8 | 95.4 | 97.9 | 95.1 | 95.8 | χ²(1) = 2.45, p = 0.1175 |
| | | SEN (MT) | 89.3 | 85.4 | 77.8 | 63.6 | 92.0 | 87.0 | |
| | | SPE (WT) | 97.2 | 95.5 | 97.0 | 98.8 | 96.5 | 97.4 | |
| | | AUC | 0.983 | 0.968 | 0.961 | 0.975 | 0.988 | 0.980 | |
| Centralized Training using Synthetic Samples | Synthetic TCGA + Synthetic UTSW + Synthetic EGD | ACC | 94.8 | 92.8 | 95.0 | 98.1 | 94.5 | 95.5 | χ²(1) = 1.31, p = 0.2531 |
| | | SEN (MT) | 83.5 | 91.7 | 88.9 | 72.7 | 88.0 | 86.1 | |
| | | SPE (WT) | 97.7 | 93.2 | 95.5 | 98.8 | 97.4 | 97.2 | |
| | | AUC | 0.972 | 0.952 | 0.962 | 0.960 | 0.962 | 0.966 | |
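
The McNemar Test column gives a χ² statistic with one degree of freedom and its p-value. The sketch below shows one way such a p-value can be recovered from the statistic, and how the statistic is usually formed from discordant prediction pairs. It is illustrative only: the function names `mcnemar_chi2` and `chi2_sf_1df` are hypothetical, the discordant counts in the usage lines are made up, and the table does not say whether a continuity correction was applied, so that choice is exposed as a parameter.

```python
import math


def mcnemar_chi2(b: int, c: int, continuity: bool = False) -> float:
    """McNemar statistic from the two discordant counts.

    b: cases model A classified correctly and model B incorrectly.
    c: cases model A classified incorrectly and model B correctly.
    continuity: apply the (|b - c| - 1)^2 correction; whether the
    source used it is NOT stated, so this is an assumption.
    """
    if b < 0 or c < 0:
        raise ValueError("discordant counts must be non-negative")
    if b + c == 0:
        raise ValueError("no discordant pairs; the statistic is undefined")
    diff = abs(b - c)
    if continuity:
        diff = max(diff - 1, 0)
    return diff ** 2 / (b + c)


def chi2_sf_1df(stat: float) -> float:
    """Upper-tail p-value of a chi-square variable with 1 degree of freedom.

    For 1 df the variable is the square of a standard normal, so
    P(X > stat) = erfc(sqrt(stat / 2)).
    """
    if stat < 0:
        raise ValueError("statistic must be non-negative")
    return math.erfc(math.sqrt(stat / 2.0))


# Statistics as reported in Table 4 (rounded to two decimals there).
p_fl = chi2_sf_1df(2.45)
p_synthetic = chi2_sf_1df(1.31)

# Hypothetical discordant counts, for illustration only.
example_stat = mcnemar_chi2(b=10, c=5)
example_p = chi2_sf_1df(example_stat)
```

With the reported statistics, `chi2_sf_1df(2.45)` comes out at about 0.1175, matching the FL row. `chi2_sf_1df(1.31)` comes out at about 0.252, slightly below the reported 0.2531; a gap of that size is what one would expect from the statistic having been rounded to two decimals. Both p-values exceed 0.05, so neither comparison shows a significant difference at that level.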