Extended Data Table 1 Overview of pitfall sources for image-level classification metrics ((a): counting metrics, (b): multi-threshold metrics) related to poor metric selection [P2]

From: Understanding metric-related pitfalls in image analysis validation

  1. Pitfalls for semantic segmentation, object detection and instance segmentation are provided in Extended Data Tables 2–5, respectively. A warning sign indicates a potential pitfall for the metric in the corresponding column if the property represented by the respective row holds true. Comprehensive illustrations of pitfalls are available in Supplementary Note 2. A comprehensive list of pitfalls is provided separately for each metric in the metrics cheat sheets (Supplementary Note 3). Note that we only list sources of pitfalls relevant to the considered metrics; other sources of pitfalls are omitted from this table. (a) Counting metrics. Considered metrics: Accuracy (Fig. SN 3.38), Balanced Accuracy (BA) (Fig. SN 3.39), Expected Cost (EC) (Fig. SN 3.42), Fβ Score (Fig. SN 3.43), Matthews Correlation Coefficient (MCC) (Fig. SN 3.46), Net Benefit (NB) (Fig. SN 3.47), Negative Predictive Value (NPV) (Fig. SN 3.48), Positive Likelihood Ratio (LR+) (Fig. SN 3.50), Positive Predictive Value (PPV) (Fig. SN 3.51), Sensitivity (Sens) (Fig. SN 3.52), Specificity (Spec) (Fig. SN 3.53), Weighted Cohen’s Kappa (WCK) (Fig. SN 3.54). (b) Multi-threshold metrics. Considered metrics: Area under the Receiver Operating Characteristic Curve (AUROC) (Fig. SN 3.55) and Average Precision (AP) (Fig. SN 3.56).
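
To make the distinction between the two metric groups concrete, the following minimal Python sketch computes a few of the counting metrics listed in (a) from the confusion-matrix counts at a single decision threshold, and AUROC from (b) directly from continuous scores. It is an illustration only, not the reference implementation behind the table: the function names `counting_metrics` and `auroc` and the example counts are hypothetical, only binary classification is covered, and zero denominators are not handled.

```python
import math


def counting_metrics(tp, fp, fn, tn):
    """Counting metrics from binary confusion-matrix counts (one fixed threshold).

    Hypothetical helper for illustration; assumes every denominator is non-zero.
    """
    sens = tp / (tp + fn)  # Sensitivity (true positive rate)
    spec = tn / (tn + fp)  # Specificity (true negative rate)
    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "balanced_accuracy": (sens + spec) / 2,
        "sensitivity": sens,
        "specificity": spec,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "mcc": (tp * tn - fp * fn) / mcc_denominator,
    }


def auroc(scores, labels):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. This pairwise form is quadratic in the
    number of samples and is meant for small illustrative inputs only.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    correct = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return correct / (len(pos) * len(neg))


# Hypothetical imbalanced test set: 10 positive and 990 negative images.
m = counting_metrics(tp=1, fp=1, fn=9, tn=989)
print({name: round(value, 3) for name, value in m.items()})

# Multi-threshold metric: computed from scores, no single threshold required.
print(auroc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]))
```

With these made-up counts, Accuracy is 0.99 although only one of the ten positive cases is detected, whereas Balanced Accuracy stays close to 0.55. This is one example of a metric whose value is dominated by the majority class under high class imbalance.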