Table 1 Model performance for each experiment

| Experiment | Revisit cycle | Error | Accuracy | Sensitivity | Specificity | F1-score | AUROC | Best cycle<sup>a</sup> |
|---|---|---|---|---|---|---|---|---|
| Pre-trained model | n/a | n/a | 67% | 72% | 63% | 67% | 73% | 10 |
| Clean baseline | | | 75% (71–75) | 75% (71–75) | 75% (70–75) | 73% (69–73) | 82% (79–82) | 26 (23–29) |
| Dirty baseline 1st scenario | | | 54% | 0% | 100% | 0% | 49% | 11 |
| Dirty baseline 2nd scenario | | | 54% | 0% | 100% | 0% | 45% | 11 |
| 1st scenario | 2 | 2% | 73% | 70% | 75% | 70% | 80% | 25 |
| | 2 | 3% | 70% | 75% | 65% | 70% | 78% | 20 |
| | 2 | 4% | 71% | 64% | 77% | 67% | 77% | 22 |
| | 2 | 5% | 71% | 60% | 80% | 66% | 78% | 20 |
| | 5 | 2% | 69% | 65% | 72% | 66% | 73% | 20 |
| | 5 | 3% | 71% | 63% | 78% | 67% | 78% | 20 |
| | 5 | 4% | 71% | 63% | 77% | 67% | 77% | 22 |
| | 5 | 5% | 71% | 64% | 77% | 67% | 77% | 22 |
| 2nd scenario | 2 | 2% | 72% | 73% | 70% | 70% | 78% | 21 |
| | 2 | 3% | 72% | 72% | 72% | 70% | 78% | 26 |
| | 2 | 4% | 55% | 9% | 91% | 16% | 48% | 12 |
| | 2 | 5% | 54% | 0% | 100% | 0% | 47% | 11 |
| | 5 | 2% | 72% | 78% | 67% | 72% | 78% | 26 |
| | 5 | 3% | 70% | 71% | 70% | 69% | 78% | 26 |
| | 5 | 4% | 55% | 9% | 91% | 16% | 48% | 12 |
| | 5 | 5% | 54% | 0% | 100% | 0% | 47% | 11 |
| Real 83 centers | 2 | 2% | 61% | 30% | 87% | 41% | 66% | 15 |
| | 2 | 3% | 72% | 71% | 73% | 71% | 78% | 22 |
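
As a reading aid for the threshold-based columns of Table 1, the sketch below shows how accuracy, sensitivity, specificity, and F1-score follow from confusion-matrix counts. It is a minimal illustration, not the authors' evaluation code: the function name `classification_metrics` and the counts are hypothetical. The 46/54 class split in particular is an assumption, chosen only because a model that predicts the negative class for every sample would then reproduce the pattern in the dirty-baseline rows (accuracy 54%, sensitivity 0%, specificity 100%, F1-score 0%).

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute threshold-based metrics from confusion-matrix counts.

    Ratios with a zero denominator are reported as 0.0 (a common convention).
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # recall on positives
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # recall on negatives
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }


# Hypothetical counts (assumption): 100 samples, 46 positive and 54 negative,
# with a collapsed model that labels every sample as negative.
collapsed = classification_metrics(tp=0, fp=0, tn=54, fn=46)
for name, value in collapsed.items():
    print(f"{name}: {value:.0%}")
```

Under these assumed counts, accuracy equals the share of negative samples while sensitivity and F1-score drop to zero, which is why accuracy alone can look moderate for a model that never predicts the positive class.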