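Table 1. Overview of tasks in the DRAGON benchmark. Task type abbreviations: SL, single-label; ML, multi-label; Bin, binary; MC, multi-class; Clf, classification; Reg, regression; NER, named entity recognition.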
ID | Name | Task type | Metric | Number of development cases | Number of testing cases |
---|---|---|---|---|---|
T1 | Adhesion presence | SL Bin Clf | AUROC | 397 | 166 |
T2 | Pulmonary nodule presence | SL Bin Clf | AUROC | 1000 | 200 |
T3 | Kidney abnormality identification | SL Bin Clf | AUROC | 417 | 183 |
T4 | Skin histopathology case selection | SL Bin Clf | AUROC | 531 | 225 |
T5 | RECIST timeline | SL Bin Clf | AUROC | 278 | 119 |
T6 | Histopathology cancer origin | SL Bin Clf | AUROC | 715 | 304 |
T7 | Pulmonary nodule size presence | SL Bin Clf | AUROC | 348 | 66 |
T8 | PDAC size presence | SL Bin Clf | AUROC | 418 | 179 |
T9 | PDAC diagnosis | SL MC Clf | Unweighted Kappa | 1374 | 588 |
T10 | Prostate radiology suspicious lesions | SL MC Clf | Linearly Weighted Kappa | 5111 | 2229 |
T11 | Prostate histopathology significant cancers | SL MC Clf | Linearly Weighted Kappa | 2213 | 952 |
T12 | Histopathology tissue type | SL MC Clf | Unweighted Kappa | 707 | 304 |
T13 | Histopathology tissue origin | SL MC Clf | Unweighted Kappa | 718 | 297 |
T14 | Entailment diagnostic sentences | SL MC Clf | Linearly Weighted Kappa | 12627 | 1422 |
T15 | Colon histopathology diagnosis | ML Bin Clf | Macro AUROC | 2748 | 1177 |
T16 | RECIST lesion size presence | ML Bin Clf | AUROC | 278 | 119 |
T17 | PDAC attributes | ML MC Clf | Unweighted Kappa | 418 | 179 |
T18 | Hip Kellgren-Lawrence scoring | ML MC Clf | Unweighted Kappa | 4803 | 172 |
T19 | Prostate volume measurement | SL Reg | RSMAPES (ε = 4 cm³) | 5138 | 2170 |
T20 | Prostate specific antigen measurement | SL Reg | RSMAPES (ε = 0.4 ng/mL) | 4759 | 2046 |
T21 | Prostate specific antigen density measurement | SL Reg | RSMAPES (ε = 0.04 ng/mL²) | 4700 | 2020 |
T22 | PDAC size measurement | SL Reg | RSMAPES (ε = 4 mm) | 343 | 147 |
T23 | Pulmonary nodule size measurement | SL Reg | RSMAPES (ε = 4 mm) | 186 | 32 |
T24 | RECIST lesion size measurements | ML Reg | RSMAPES (ε = 4 mm) | 278 | 119 |
T25 | Anonymization | SL NER | Macro F1 | 3078 | 1307 |
T26 | Medical terminology recognition | SL NER | F1 | 175 | 75 |
T27 | Prostate biopsy sampling | ML NER | Weighted F1 | 349 | 146 |
T28 | Skin histopathology diagnosis | ML NER | Weighted F1 | 439 | 185 |
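
The regression tasks (T19 to T24) each pair RSMAPES with a task-specific tolerance ε given in the unit of the target. The table itself does not define the metric, so the sketch below rests on an assumption: that RSMAPES (robust symmetric mean absolute percentage error score) is one minus the mean of |y − ŷ| / (|y| + |ŷ| + ε). The function name `rsmapes` and the example values are hypothetical and serve only to show how ε damps the penalty for small absolute errors.

```python
def rsmapes(y_true, y_pred, epsilon):
    """Score in (0, 1], higher is better.

    ASSUMED definition (not stated in the table):
        1 - mean(|y - y_hat| / (|y| + |y_hat| + epsilon))
    epsilon must be positive and in the same unit as the targets.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one case is required")
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")

    total = 0.0
    for y, y_hat in zip(y_true, y_pred):
        # epsilon keeps the denominator non-zero and limits the penalty
        # when both the label and the prediction are small.
        total += abs(y - y_hat) / (abs(y) + abs(y_hat) + epsilon)
    return 1.0 - total / len(y_true)


# Hypothetical lesion sizes in mm, scored with the epsilon listed for T22.
labels = [10.0, 25.0, 32.0]
predictions = [6.0, 25.0, 30.0]
print(rsmapes(labels, predictions, epsilon=4.0))
```

Under this assumed definition, a perfect prediction scores 1, and a fixed absolute error costs less on large targets than on small ones, which is why ε is set per task in the unit of the measured quantity.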