Fig. 4: P2.1: disregard of the domain interest. | Nature Methods

From: Understanding metric-related pitfalls in image analysis validation

a, Importance of structure boundaries. The predictions of two algorithms (Predictions 1 and 2) capture the boundary of the given structure in substantially different ways, but lead to the same DSC owing to the metric’s boundary unawareness. This pitfall is also relevant for other overlap-based metrics such as clDice, pixel-level Fβ score and IoU, as well as localization criteria such as box/approx/mask IoU, center distance, mask IoU > 0, point inside mask/box/approx and intersection over reference. b, Unequal severity of class confusions. When predicting the severity of a disease for three individuals in an ordinal classification problem, Prediction 1 assumes a much lower severity for Patient 3 than is actually observed. This critical issue is overlooked by common metrics (here, accuracy), which register no difference between Prediction 1 and Prediction 2, even though Prediction 2 assesses the severity much better. Metrics with pre-defined weights (here, expected cost (EC)) correctly penalize Prediction 1 much more than Prediction 2. This pitfall is also relevant for other counting metrics, such as BA, Fβ score, positive likelihood ratio (LR+), Matthews correlation coefficient (MCC), net benefit (NB), negative predictive value (NPV), positive predictive value (PPV), sensitivity and specificity.
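Both panels can be reproduced on toy data. The following is a minimal sketch, not the figure's actual data: the 4×4 reference mask, the two predicted masks, the three-level severity labels and the absolute-difference cost matrix are all hypothetical choices made for illustration, and the expected cost is computed here simply as the average cost per sample.

```python
def dice(pred, ref):
    """Dice similarity coefficient for two binary masks given as sets of pixel coordinates."""
    return 2 * len(pred & ref) / (len(pred) + len(ref))


def accuracy(pred, ref):
    """Fraction of samples whose predicted class equals the reference class."""
    return sum(p == r for p, r in zip(pred, ref)) / len(ref)


def average_cost(pred, ref, cost):
    """Mean of cost[reference][prediction] over all samples (a simple empirical expected cost)."""
    return sum(cost[r][p] for p, r in zip(pred, ref)) / len(ref)


# Panel a (hypothetical masks): a 4x4 reference square.
ref_mask = {(r, c) for r in range(4) for c in range(4)}
# Prediction 1 misses one entire edge row of the structure.
pred_mask_1 = {(r, c) for r in range(3) for c in range(4)}
# Prediction 2 misses the same number of pixels, but only the four corners.
pred_mask_2 = ref_mask - {(0, 0), (0, 3), (3, 0), (3, 3)}

dsc_1 = dice(pred_mask_1, ref_mask)
dsc_2 = dice(pred_mask_2, ref_mask)
print(f"DSC, Prediction 1: {dsc_1:.3f}  Prediction 2: {dsc_2:.3f}")

# Panel b (hypothetical labels): ordinal severity 0 < 1 < 2 for three patients.
reference = [0, 1, 2]
prediction_1 = [0, 1, 0]  # Patient 3 rated far too low
prediction_2 = [0, 1, 1]  # Patient 3 rated only one level too low

# Assumed cost matrix: the cost grows with the ordinal distance between classes.
cost = [[abs(i - j) for j in range(3)] for i in range(3)]

acc_1, acc_2 = accuracy(prediction_1, reference), accuracy(prediction_2, reference)
ec_1 = average_cost(prediction_1, reference, cost)
ec_2 = average_cost(prediction_2, reference, cost)
print(f"Accuracy, Prediction 1: {acc_1:.3f}  Prediction 2: {acc_2:.3f}")
print(f"Average cost, Prediction 1: {ec_1:.3f}  Prediction 2: {ec_2:.3f}")
```

In this sketch the two masks differ in where the boundary error occurs yet share the same overlap count, so the DSC cannot separate them. Likewise, both label predictions contain exactly one error, so accuracy is identical, while the cost-weighted score is higher for the prediction that is two severity levels off.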
