Fig. 1: Model development with environmentally sustainable performance (ESPer) and ESPer scores.
From: Ecologically sustainable benchmarking of AI models for histopathology

Study outline and environmentally sustainable performance (ESPer) metrics. a Workflow example for sustainable model development using an ESPer score (iESPer or fpESPer, see below). The workflow comprises the development steps, the ESPer metrics used for evaluation, and the datasets. We used renal cell carcinoma subtyping (RCC) and kidney transplant disease classification (KTX) as use cases in our study. Depending on the medical task, the weighting factor can be set upfront to balance performance against ecological sustainability. The dataset row indicates how much data is needed for each step of model development. Model optimization approaches such as pruning, knowledge distillation, hyperparameter tuning, or quantization have been described previously; they were not tested here but are included in the figure to give a more complete picture of model development. b Formula and diagram for the inference ESPer (iESPer), where \(\mathrm{iESPer}_{\mathrm{i},\mathrm{Perf}}\) is the iESPer score for model \(\mathrm{i}\) in the comparison series and performance metric \(\mathrm{Perf}\), \(\mathrm{M}_{\mathrm{i},\mathrm{Perf}}\) is the measured metric for model \(\mathrm{i}\), \(w\) is the weighting factor, \(\mathrm{CO_2eq}_{\mathrm{i},\mathrm{inf}}\) is the CO₂eq produced by model \(\mathrm{i}\) during inference, and \(\mathrm{X}^{\prime}\) is the range normalization operation applied to \(\mathrm{X}\). c Formula and diagram for the future projection ESPer (fpESPer). The notation follows the formula in (b), with two additions: \(\mathrm{CO_2eq}_{\mathrm{i},\mathrm{train}}\) is the CO₂eq produced by model \(\mathrm{i}\) during training, and \(\mathrm{n}_{\mathrm{usage}}\) is the projected number of usages of the model.
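
The range normalization \(\mathrm{X}^{\prime}\) mentioned in (b) can be illustrated with a minimal sketch. It assumes that "range normalization" means min-max scaling to [0, 1] across the models in the comparison series; the caption does not spell this out. The function name `range_normalize` and the example values are hypothetical, and the sketch does not reproduce the iESPer or fpESPer formulas, which appear only in the figure panels.

```python
from typing import Sequence


def range_normalize(values: Sequence[float]) -> list[float]:
    """Min-max scale values to [0, 1] (assumed meaning of X')."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All models are identical on this quantity; map them all to 0.0
        # to avoid dividing by zero (an illustrative choice, not from the source).
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Made-up inference CO2eq values for three models in a comparison series.
co2eq_inference = [2.0, 4.0, 10.0]
normalized = range_normalize(co2eq_inference)
print(normalized)
```

Because the scaling depends on the minimum and maximum within the series, the normalized value of a model changes when models are added to or removed from the comparison.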