Table 1 Comparison of our proposed RadFM with other foundation models on nine existing datasets, together with ablation studies
From: Towards generalist foundation model for radiology by leveraging web-scale 2D&3D medical data
| Dataset | Metric | OpenFlamingo (Few-shot) | MedVInT | LLaVA-Med | MedFlamingo (Few-shot) | RadFM (w/o Ins-tuning) | RadFM (w/o Our Data) | RadFM |
|---|---|---|---|---|---|---|---|---|
| Disease diagnosis | | | | | | | | |
| VinDr-Mammo | ACC | 49.92 (48.20, 51.65) | 50.06 (48.52, 51.59) | 50.27 (49.20, 51.52) | 49.80 (48.33, 51.42) | 49.80 (48.15, 51.53) | 55.35 (54.35, 559.47) | 59.96 (58.41, 61.59) |
| | F1 | 57.01 (54.64, 60.08) | 66.56 (65.2, 67.93) | 56.48 (55.45, 57.73) | 64.92 (63.52, 66.32) | 60.32 (58.25, 62.31) | 60.57 (58.75, 62.58) | 62.11 (60.09, 63.75) |
| VinDr-SpineXr | ACC | 50.33 (47.13, 53.53) | 49.93 (46.99, 52.86) | 49.85 (47.95, 52.23) | 49.61 (46.05, 53.16) | 52.19 (49.18, 55.17) | 64.43 (61.76, 67.22) | 68.82 (65.92, 71.47) |
| | F1 | 31.79 (26.99, 36.58) | 62.32 (59.38, 65.25) | 54.83 (51.88, 57.45) | 63.23 (59.74, 66.74) | 34.19 (30.17, 38.09) | 65.86 (62.62, 68.81) | 67.69 (64.5, 70.98) |
| VinDr-PCXR | ACC | 49.85 (45.40, 54.31) | 50.29 (45.88, 54.69) | 49.62 (45.79, 53.64) | 49.37 (44.44, 54.31) | 50.12 (45.21, 54.60) | 51.82 (46.46, 57.09) | 56.32 (51.82, 61.21) |
| | F1 | 41.44 (33.77, 49.10) | 66.29 (62.36, 70.23) | 47.81 (42.33, 53.42) | 66.94 (62.57, 71.32) | 43.33 (40.37, 40.88) | 49.14 (43.66, 56.18) | 37.53 (28.88, 43.67) |
| CXR-Mix | ACC | 50.63 (50.07, 51.03) | 49.2 (48.53, 49.88) | 53.26 (52.72, 53.91) | 50.00 (49.50, 50.51) | 77.71 (77.25, 77.95) | 78.63 (78.51, 79.10) | 83.62 (83.23, 83.97) |
| | F1 | 24.83 (24.11, 25.54) | 67.22 (66.62, 67.82) | 22.63 (22.70, 24.53) | 66.11 (65.72, 66.61) | 74.42 (73.98, 75.01) | 78.35 (77.85, 78.93) | 82.99 (82.58, 83.49) |
| RadChest-CT | ACC | 50.93 (49.13, 52.72) | 50.07 (47.68, 52.45) | 51.09 (50.05, 52.63) | 50.39 (48.34, 52.43) | 51.97 (50.05, 53.31) | 69.72 (67.44, 71.53) | 72.95 (71.06, 74.78) |
| | F1 | 43.49 (41.18, 45.99) | 66.57 (64.45, 68.69) | 44.42 (42.00, 46.55) | 63.31 (61.39, 65.23) | 38.67 (36.37, 41.46) | 67.84 (65.64, 70.11) | 71.86 (69.42, 83.49) |
| Medical VQA | | | | | | | | |
| PMC-VQA | BLEU | 11.10 (8.93, 13.41) | 23.73 (21.03, 26.73) | 13.66 (11.68, 15.52) | 11.03 (9.27, 13.49) | 5.23 (3.23, 8.84) | 14.01 (10.92, 17.25) | 17.99 (14.80, 20.83) |
| | ROUGE | 13.03 (10.63, 15.46) | 27.24 (24.04, 30.91) | 18.14 (16.46, 20.20) | 13.06 (10.93, 15.66) | 5.82 (2.03, 10.09) | 14.23 (11.20, 17.66) | 19.43 (16.56, 23.55) |
| | UMLS_Precision | 7.60 (5.41, 10.83) | 19.64 (16.2, 23.59) | 16.38 (12.67, 20.25) | 6.45 (4.05, 8.97) | 18.63 (14.84, 20.76) | 13.24 (9.90, 17.02) | 20.74 (17.39, 24.71) |
| | UMLS_Recall | 7.56 (5.40, 10.51) | 18.88 (15.51, 22.68) | 13.34 (10.59, 16.07) | 6.10 (4.04, 8.97) | 15.03 (12.07, 18.34) | 12.94 (9.39, 15.86) | 14.14 (11.19, 17.37) |
| | BERT-Sim | 52.08 (50.43, 54.07) | 57.81 (55.49, 59.76) | 42.46 (41.50, 43.44) | 51.37 (49.57, 53.01) | 47.85 (44.20, 49.37) | 57.57 (55.85, 60.19) | 63.85 (62.04, 65.94) |
| VQA-RAD | BLEU | 33.98 (26.75, 41.85) | 35.1 (28.44, 41.55) | 31.55 (24.89, 38.35) | 35.97 (29.14, 45.45) | 22.03 (15.67, 30.38) | 43.98 (36.58, 50.51) | 52.24 (44.97, 59.43) |
| | ROUGE | 35.26 (28.21, 43.91) | 39.2 (31.36, 46.33) | 37.47 (30.83, 44.47) | 38.64 (31.42, 48.23) | 22.67 (14.92, 28.57) | 44.70 (38.35, 50.81) | 52.74 (45.39, 61.05) |
| | UMLS_Precision | 14.72 (6.86, 24.22) | 16.46 (7.83, 25.93) | 13.30 (12.14, 14.50) | 18.70 (8.76, 29.61) | 60.30 (50.88, 67.07) | 61.52 (53.65, 69.51) | 62.12 (54.01, 71.12) |
| | UMLS_Recall | 14.52 (7.63, 23.33) | 15.94 (7.72, 25.48) | 12.16 (10.09, 13.93) | 17.46 (8.76, 27.85) | 39.43 (32.59, 47.12) | 41.14 (34.49, 48.76) | 42.82 (32.31, 51.54) |
| | BERT-Sim | 71.49 (67.63, 74.96) | 71.39 (66.94, 75.46) | 68.28 (64.07, 72.00) | 73.40 (69.62, 77.32) | 58.88 (56.74, 61.08) | 80.64 (77.55, 83.89) | 81.52 (77.41, 85.17) |
| SLAKE | BLEU | 27.16 (22.01, 32.56) | 24.81 (20.23, 30.52) | 21.43 (17.07, 25.35) | 23.62 (18.06, 28.26) | 24.39 (15.81, 30.74) | 67.44 (63.74, 71.68) | 78.56 (72.2, 83.28) |
| | ROUGE | 29.36 (24.23, 34.73) | 29.08 (24.06, 34.8) | 29.92 (25.31, 34.09) | 24.86 (19.47, 29.94) | 24.81 (16.93, 30.59) | 67.90 (63.58, 74.28) | 79.42 (75.15, 84.05) |
| | UMLS_Precision | 23.02 (17.52, 30.73) | 23.32 (18.08, 29.42) | 23.14 (18.29, 28.86) | 18.28 (13.23, 23.38) | 68.87 (64.43, 73.27) | 76.09 (71.63, 80.21) | 81.5 (76.81, 86.87) |
| | UMLS_Recall | 22.71 (17.48, 29.53) | 23.74 (18, 30.08) | 23.31 (18.29, 27.98) | 19.21 (13.38, 24.37) | 57.38 (52.49, 63.66) | 72.04 (67.59, 76.36) | 74.42 (66.7, 81.19) |
| | BERT-Sim | 69.42 (66.09, 72.04) | 67.7 (64.94, 70.69) | 69.14 (66.53, 70.92) | 66.93 (63.98, 70.32) | 62.35 (61.15, 63.66) | 90.93 (89.46, 92.30) | 93.30 (90.99, 95.60) |
| Report generation | | | | | | | | |
| MIMIC-CXR | BLEU | 23.79 (22.62, 24.86) | 0.04 (0.01, 0.08) | 11.29 (9.92, 12.86) | 22.65 (20.93, 24.06) | 11.06 (8.36, 14.43) | 20.63 (17.16, 25.43) | 19.43 (16.12, 23.25) |
| | ROUGE | 35.83 (33.7, 37.96) | 2.69 (2.26, 3.15) | 13.91 (12.63, 15.29) | 27.29 (25.63, 29.04) | 15.05 (12.72, 19.54) | 25.42 (21.89, 29.47) | 26.18 (23.07, 29.86) |
| | UMLS_Precision | 16.75 (15.74, 17.88) | 26.67 (11.19, 42.12) | 10.50 (8.42, 12.88) | 22.36 (20.13, 24.33) | 21.80 (19.26, 24.29) | 43.64 (36.96, 49.45) | 45.51 (40.47, 52.77) |
| | UMLS_Recall | 24.93 (22.86, 27.38) | 0.52 (0.2, 0.88) | 10.71 (8.37, 13.85) | 19.64 (17.89, 21.43) | 15.97 (12.92, 18.48) | 22.73 (19.64, 26.57) | 23.39 (20.18, 27.53) |
| | BERT-Sim | 65.91 (65.20, 66.70) | 34.48 (32.69, 36.02) | 49.20 (48.22, 50.35) | 66.03 (65.37, 66.83) | 63.13 (61.31, 64.87) | 64.22 (61.74, 65.97) | 66.77 (64.87, 68.58) |
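
Each result cell in the table follows the pattern `value (lower, upper)`. The table itself does not state what the two bounds represent, so the sketch below assumes only that they form an interval around the reported value. It parses one cell and checks whether the value falls inside its bounds, which may help when transcribing or reusing these numbers. The names `Cell`, `parse_cell`, and `value_within_bounds` are hypothetical helpers introduced for illustration, not part of any RadFM code.

```python
import re
from typing import NamedTuple


class Cell(NamedTuple):
    """One table cell, read as: value (lower, upper)."""
    value: float
    lower: float
    upper: float


# Pattern assumed from the table layout; the space after the comma is optional
# because some cells are written "(48.15,51.53)" and others "(48.20, 51.65)".
_CELL_RE = re.compile(
    r"^\s*([0-9.]+)\s*\(\s*([0-9.]+)\s*,\s*([0-9.]+)\s*\)\s*$"
)


def parse_cell(text: str) -> Cell:
    """Parse a cell such as '59.96 (58.41, 61.59)' into three floats."""
    match = _CELL_RE.match(text)
    if match is None:
        raise ValueError(f"unrecognized cell: {text!r}")
    return Cell(*(float(group) for group in match.groups()))


def value_within_bounds(cell: Cell) -> bool:
    """Return True when lower <= value <= upper."""
    return cell.lower <= cell.value <= cell.upper
```

For example, `parse_cell("59.96 (58.41, 61.59)")` yields a cell whose value lies inside its bounds, while `parse_cell("43.33 (40.37, 40.88)")` does not pass `value_within_bounds`, so a cell like that is worth checking against the published table before it is reused.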