Extended Data Fig. 3: Model performance at various magnifications.
From: Deep learning-enabled assessment of cardiac allograft rejection from endomyocardial biopsies

Model performance at different magnifications scales at a. slide-level and b. patient-level. Reported are AUC-ROC curves with 95% CI for 40×, 20× and 10× computed for the US test set (n = 995 WSIs, N = 336 patients). For the rejection detection tasks, the model typically performs better at higher magnification, while the grade predictions benefit from the increased context presented at lower magnifications. To account for the information from different scales, the detection of rejections and Quilty-B lesions is performed from the fusion of the model predictions from all available scales. In comparison, the rejection grade is determined from 10X magnification. c. Model performance during training and validation. Shown is cross-entropy loss for the multi-task model assessing the biopsy state and for the single-task model estimating the rejection grade. Reported is slide-level performance at 40× for the multi-task model, while the grading scores are measured at 10X magnification. The model with the lowest validation loss encountered during the training is used as the final model.