Extended Data Table 1 Overview of pitfall sources for image-level classification metrics ((a): counting metrics, (b): multi-threshold metrics) related to poor metric selection [P2]

Pitfalls for semantic segmentation, object detection and instance segmentation are provided in Extended Data Tables 2–5. A warning sign indicates a potential pitfall for the metric in the corresponding column if the property represented by the respective row holds. Comprehensive illustrations of pitfalls are available in Supplementary Note 2. A comprehensive list of pitfalls is provided separately for each metric in the metrics cheat sheets (Supplementary Note 3). Note that we list only the sources of pitfalls relevant to the considered metrics; other sources of pitfalls are omitted from this table.

(a) Counting metrics. Considered metrics: Accuracy (Fig. SN 3.38), Balanced Accuracy (BA) (Fig. SN 3.39), Expected Cost (EC) (Fig. SN 3.42), F_β Score (Fig. SN 3.43), Matthews Correlation Coefficient (MCC) (Fig. SN 3.46), Net Benefit (NB) (Fig. SN 3.47), Negative Predictive Value (NPV) (Fig. SN 3.48), Positive Likelihood Ratio (LR+) (Fig. SN 3.50), Positive Predictive Value (PPV) (Fig. SN 3.51), Sensitivity (Sens) (Fig. SN 3.52), Specificity (Spec) (Fig. SN 3.53), Weighted Cohen’s Kappa (WCK) (Fig. SN 3.54).

(b) Multi-threshold metrics. Considered metrics: Area under the Receiver Operating Characteristic Curve (AUROC) (Fig. SN 3.55) and Average Precision (AP) (Fig. SN 3.56).
