Extended Data Fig. 4: Comparison of abnormality-based performance metrics in the Rad-ChestCT validation set.
From: Generalist foundation models from a multimodal dataset for 3D computed tomography

This figure reports per-abnormality performance metrics (AUROC, accuracy, precision, and F1 score) for our models on the Rad-ChestCT validation set, compared with a fully supervised baseline model. The comparison shows that our models adapt well under distribution shift and outperform the fully supervised baseline.
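For readers who want to reproduce this kind of per-abnormality comparison, the four metrics named above can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: the function name `binary_metrics`, the 0.5 decision threshold, and the toy labels and scores are all assumptions introduced here. AUROC is computed from its pairwise definition (the probability that a positive case scores higher than a negative one, with ties counted as one half).

```python
from math import nan
from typing import Dict, List


def binary_metrics(y_true: List[int], y_score: List[float],
                   threshold: float = 0.5) -> Dict[str, float]:
    """Metrics for one abnormality. `threshold` is an assumed cut-off."""
    # Binarize the predicted scores at the assumed threshold.
    y_pred = [1 if s >= threshold else 0 for s in y_score]

    # Confusion-matrix counts.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)

    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)

    # AUROC from its pairwise definition; threshold-independent.
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if pos and neg:
        wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
                   for p in pos for n in neg)
        auroc = wins / (len(pos) * len(neg))
    else:
        auroc = nan  # undefined when only one class is present

    return {"auroc": auroc, "accuracy": accuracy,
            "precision": precision, "f1": f1}


# Toy example (made-up labels and scores for a single abnormality).
toy_labels = [1, 0, 1, 0]
toy_scores = [0.9, 0.6, 0.4, 0.1]
toy_result = binary_metrics(toy_labels, toy_scores)
```

Running the function once per abnormality for each model and for the baseline yields the values a figure of this kind would plot side by side. Note that accuracy, precision, and F1 depend on the chosen threshold, whereas AUROC does not.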