Table 2 iESPer scores for RCC and KTX
From: Ecologically sustainable benchmarking of AI models for histopathology
RCC Task (iESPer)

| MODEL | CO₂eq/slide (g) | AUROC | 95% CI | ACCURACY | 95% CI | PRECISION | 95% CI | RECALL | 95% CI | F1 | 95% CI |
|---|---|---|---|---|---|---|---|---|---|---|---|
| TransMIL | 0.046 | 0.964 | 0.936–0.986 | 0.800 | 0.732–0.868 | 0.736 | 0.634–0.835 | 0.695 | 0.577–0.822 | 0.676 | 0.556–0.798 |
| CLAM | 0.048 | 0.937 | 0.908–0.960 | 0.815 | 0.753–0.874 | 0.833 | 0.764–0.897 | 0.540 | 0.442–0.661 | 0.583 | 0.453–0.717 |
| InceptionV3 | 0.073 | 0.665 | 0.511–0.788 | 0.410 | 0.238–0.580 | 0.451 | 0.258–0.640 | 0.410 | 0.234–0.592 | 0.394 | 0.218–0.584 |
| ViT | 0.065 | 0.693 | 0.593–0.784 | 0.643 | 0.571–0.718 | 0.448 | 0.270–0.654 | 0.393 | 0.316–0.485 | 0.370 | 0.272–0.484 |
| Prov-GigaPath | 0.229 | 0.349 | 0.336–0.360 | 0.303 | 0.279–0.325 | 0.261 | 0.206–0.318 | 0.214 | 0.177–0.259 | 0.219 | 0.175–0.267 |
KTX Task (iESPer)

| MODEL | CO₂eq/slide (g) | AUROC | 95% CI | ACCURACY | 95% CI | PRECISION | 95% CI | RECALL | 95% CI | F1 | 95% CI |
|---|---|---|---|---|---|---|---|---|---|---|---|
| TransMIL | 0.046 | 0.579 | 0.501–0.660 | 0.279 | 0.209–0.368 | 0.428 | 0.322–0.533 | 0.265 | 0.200–0.345 | 0.257 | 0.184–0.343 |
| CLAM | 0.048 | 0.547 | 0.472–0.627 | 0.234 | 0.169–0.312 | 0.294 | 0.216–0.385 | 0.236 | 0.171–0.306 | 0.223 | 0.157–0.295 |
| InceptionV3 | 0.073 | 0.391 | 0.334–0.451 | 0.062 | 0.038–0.097 | 0.007 | 0.004–0.011 | 0.093 | 0.093–0.093 | 0.017 | 0.012–0.024 |
| ViT | 0.065 | 0.439 | 0.373–0.511 | 0.068 | 0.037–0.105 | 0.008 | 0.004–0.012 | 0.100 | 0.100–0.100 | 0.019 | 0.011–0.026 |
| Prov-GigaPath | 0.229 | 0.188 | 0.160–0.222 | 0.065 | 0.044–0.089 | 0.110 | 0.062–0.168 | 0.055 | 0.041–0.072 | 0.042 | 0.025–0.062 |
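The table itself does not state how the 95% confidence intervals were computed; a common choice for slide-level metrics such as AUROC is a non-parametric percentile bootstrap over the test slides. The sketch below is illustrative only, assuming per-slide labels and prediction scores are available; the function name `bootstrap_ci`, the synthetic data, and the binary setting are assumptions, not the paper's procedure.

```python
# Illustrative percentile bootstrap for a slide-level metric (e.g. AUROC).
# Not the paper's exact CI method; names and data below are assumptions.
import numpy as np
from sklearn.metrics import roc_auc_score

def bootstrap_ci(y_true, y_score, metric=roc_auc_score, n_boot=1000, alpha=0.05, seed=0):
    """Point estimate and percentile-bootstrap (1 - alpha) CI for a slide-level metric."""
    rng = np.random.default_rng(seed)
    n = len(y_true)
    stats = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)        # resample slides with replacement
        if len(np.unique(y_true[idx])) < 2:     # skip degenerate resamples (AUROC undefined)
            continue
        stats.append(metric(y_true[idx], y_score[idx]))
    lo, hi = np.quantile(stats, [alpha / 2, 1 - alpha / 2])
    return metric(y_true, y_score), (lo, hi)

# Synthetic example data (not the paper's predictions):
y_true = np.array([0, 1, 1, 0, 1, 0, 1, 1, 0, 0])
y_score = np.array([0.2, 0.8, 0.7, 0.3, 0.9, 0.4, 0.6, 0.55, 0.35, 0.25])
auroc, (ci_lo, ci_hi) = bootstrap_ci(y_true, y_score)
print(f"AUROC {auroc:.3f} (95% CI {ci_lo:.3f}-{ci_hi:.3f})")
```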