Table 4 Comparison of mean Average Precision (mAP) scores (%) for detection performance on the RSNA dataset with varying ratios of annotated samples

From: Enhancing representation in radiography-reports foundation model: a granular alignment algorithm using masked contrastive learning

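| Method | Backbone | RSNA (10%) | RSNA (100%) |
| --- | --- | --- | --- |
| ImageNet | ResNet | 12.4 | 8.0 |
| BYOL | ResNet | 11.0 | 17.3 |
| SimCLR | ResNet | 12.2 | 18.8 |
| PixelPro | ResNet | 11.0 | 17.4 |
| CLIP | ResNet | 10.7 | 19.9 |
| LoVT | ResNet | 13.2 | 18.1 |
| CLIP* | VIT | 10.6 | 17.4 |
| MaCo (Ours) | VIT | 11.9 | 19.2 |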

* indicates our reimplementation of CLIP with the VIT backbone.