Table 1. ML algorithms and evaluation metrics per technical task.

From: Advancing breast, lung and prostate cancer research with federated learning: a systematic review

| Technical task | ML algorithm | Evaluation | Studies |
|---|---|---|---|
| Classification | Classic ML | ROC | 14,16,27 |
| Classification | Classic ML | Accuracy | 17,29 |
| Classification | Classic ML | Other (MCC, Precision, Recall, F1) | 17,30 |
| Classification | CNN | ROC | 15,32,36 |
| Classification | CNN | Other (Precision-Recall AUC, F1, Accuracy) | 32,36 |
| Classification | Large Pre-trained | ROC | 19,23,35 |
| Classification | Large Pre-trained | F1 score | 24,38 |
| Classification | Large Pre-trained | Accuracy | 23,24,33,35,38 |
| Classification | Large Pre-trained | Other (Confusion matrices, Precision, Recall, Kappa coefficient) | 24,33,35,38 |
| Classification | GAN | ROC | 22,24 |
| Detection | Large Pre-trained | ROC | 20,23,34 |
| Detection | Other | Other (Precision, F1, MCC) | 34 |
| Segmentation | UNet | Dice coefficient | 18,25,31,37 |
| Segmentation | UNet | Other (IoU) | 18,37 |
| Regression | Other (CNN) | Other (Dice coefficient, Accuracy) | 26,28 |

  1. Publications occurring in multiple cells represent overlapping tasks and/or evaluation metrics. In papers involving multiple (>1) tasks, we only present the ML model for which evaluation metrics are reported. Note that we refer to technical tasks, ML algorithms and evaluation metrics as reported in each publication. The term “Accuracy” denotes a standalone metric (separate from ROC analysis), as reported by the authors of each paper. The term “Other” represents metrics that occurred only once per ML model/technical task. ML: machine learning; CNN: convolutional neural network; GAN: generative adversarial network; ROC: receiver operating characteristic curve; MCC: Matthews correlation coefficient; AUC: area under the curve; IoU: intersection-over-union.
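
The reviewed studies do not share a common implementation, so as a reference point only, the sketch below shows how the metrics named in the table (ROC AUC, Precision-Recall AUC, accuracy, F1, precision, recall, MCC, kappa, Dice, IoU) are conventionally computed with scikit-learn and NumPy. All arrays are hypothetical toy inputs, not data from any reviewed study.

```python
# Minimal sketch of the evaluation metrics tabulated above (illustrative only;
# labels, scores and masks below are hypothetical, not from any reviewed study).
import numpy as np
from sklearn.metrics import (
    roc_auc_score, average_precision_score, accuracy_score, f1_score,
    precision_score, recall_score, matthews_corrcoef, cohen_kappa_score,
)

# --- Classification metrics (hypothetical binary labels and scores) ---
y_true = np.array([0, 1, 1, 0, 1, 0, 1, 1])
y_score = np.array([0.2, 0.8, 0.6, 0.3, 0.9, 0.4, 0.7, 0.55])  # model probabilities
y_pred = (y_score >= 0.5).astype(int)                           # thresholded labels

print("ROC AUC:  ", roc_auc_score(y_true, y_score))             # ROC analysis
print("PR AUC:   ", average_precision_score(y_true, y_score))   # Precision-Recall AUC
print("Accuracy: ", accuracy_score(y_true, y_pred))
print("F1:       ", f1_score(y_true, y_pred))
print("Precision:", precision_score(y_true, y_pred))
print("Recall:   ", recall_score(y_true, y_pred))
print("MCC:      ", matthews_corrcoef(y_true, y_pred))
print("Kappa:    ", cohen_kappa_score(y_true, y_pred))

# --- Segmentation overlap metrics (hypothetical binary masks) ---
def dice_coefficient(a: np.ndarray, b: np.ndarray) -> float:
    """Dice = 2|A∩B| / (|A| + |B|) for binary masks."""
    intersection = np.logical_and(a, b).sum()
    return 2.0 * intersection / (a.sum() + b.sum())

def iou(a: np.ndarray, b: np.ndarray) -> float:
    """IoU (Jaccard) = |A∩B| / |A∪B| for binary masks."""
    intersection = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    return intersection / union

mask_pred = np.array([[1, 1, 0], [0, 1, 0], [0, 0, 0]], dtype=bool)
mask_true = np.array([[1, 1, 0], [0, 0, 1], [0, 0, 0]], dtype=bool)
print("Dice:", dice_coefficient(mask_pred, mask_true))
print("IoU: ", iou(mask_pred, mask_true))
```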