Table 1 Comparison of diagnostic accuracy with state-of-the-art methods
From: Interactive computer-aided diagnosis on medical image using large language models
Observation | CvT2DistilGPT2 PR | CvT2DistilGPT2 RC | CvT2DistilGPT2 F1 | R2GenCMN PR | R2GenCMN RC | R2GenCMN F1 | PCAM PR | PCAM RC | PCAM F1 | Ours (GPT-3) PR | Ours (GPT-3) RC | Ours (GPT-3) F1 | Ours (ChatGPT) PR | Ours (ChatGPT) RC | Ours (ChatGPT) F1
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---
Cardiomegaly | 0.512 | 0.591 | 0.549 | 0.590 | 0.534 | 0.561 | 0.846 | 0.190 | 0.310 | 0.606 | 0.569 | 0.587 | 0.663 | 0.595 | 0.627
Edema | 0.224 | 0.468 | 0.303 | 0.563 | 0.252 | 0.348 | 0.602 | 0.579 | 0.591 | 0.563 | 0.626 | 0.593 | 0.556 | 0.514 | 0.534
Consolidation | 0.063 | 0.239 | 0.099 | 0.667 | 0.121 | 0.205 | 0.325 | 0.788 | 0.460 | 0.310 | 0.803 | 0.447 | 0.322 | 0.697 | 0.440
Atelectasis | 0.306 | 0.388 | 0.342 | 0.442 | 0.504 | 0.471 | 0.468 | 0.991 | 0.636 | 0.408 | 0.991 | 0.578 | 0.470 | 0.981 | 0.636
Pleural Effusion | 0.454 | 0.692 | 0.548 | 0.819 | 0.500 | 0.618 | 0.728 | 0.916 | 0.811 | 0.634 | 0.916 | 0.749 | 0.736 | 0.845 | 0.787
Average | 0.312 | 0.476 | 0.368 | 0.616 | 0.382 | 0.441 | 0.594 | 0.693 | 0.562 | 0.504 | 0.781 | 0.591 | 0.549 | 0.726 | 0.605
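
To make the relationship between the columns concrete, the following minimal sketch recomputes the F1 column for "Ours (ChatGPT)" from its PR and RC values. It assumes that PR and RC denote precision and recall, that F1 is their harmonic mean, and that the Average row is the unweighted mean over the five observations. The table does not state these definitions, so they are assumptions here, and the names `f1_score`, `chatgpt`, `per_class_f1`, and `macro_f1` are illustrative only.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; returns 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (PR, RC) pairs for the "Ours (ChatGPT)" columns, copied from Table 1.
chatgpt = {
    "Cardiomegaly": (0.663, 0.595),
    "Edema": (0.556, 0.514),
    "Consolidation": (0.322, 0.697),
    "Atelectasis": (0.470, 0.981),
    "Pleural Effusion": (0.736, 0.845),
}

# Per-observation F1, rounded to three decimals like the table entries.
per_class_f1 = {
    name: round(f1_score(pr, rc), 3) for name, (pr, rc) in chatgpt.items()
}

# Assumed definition of the Average row: unweighted mean over observations.
macro_f1 = round(sum(per_class_f1.values()) / len(per_class_f1), 3)

for name, value in per_class_f1.items():
    print(f"{name}: {value:.3f}")
print(f"Average: {macro_f1:.3f}")
```

Under these assumptions the recomputed values agree with the table's F1 entries for this method to three decimal places, which suggests the Average row is a macro average and not a single F1 computed from the averaged PR and RC.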