Fig. 3: Different types of prediction errors. | Neuropsychopharmacology


From: Deep learning for small and big data in psychiatry


a Contours of two Gaussian distributions associated with two fictional populations (red and blue), each describing a probabilistic relationship between a feature and an outcome (e.g., brain volume reduction and age). The ellipses mark points of equal probability density at 1, 2, and 3 standard deviations (σ = 1, 2, 3), indicating the spread of each Gaussian. The red population shows slightly less spread (potentially because of stricter inclusion criteria or a different measurement device used for this population). b Two random samples of n = 30 points each, one drawn from each distribution (shown in the corresponding colors). c Half (50%) of the red sample (depicted in b) is used to fit a linear model (thick sloped red line). The remaining 50% of the sample points (the test set), displayed as white circles, are used to evaluate the out-of-sample error (red vertical lines). A new sample of outcomes is drawn at exactly the same feature values used for training (orange circles) and is used to evaluate the in-sample prediction error (orange vertical lines). d The model (red line, same as in c) is now used to predict the outcome for the blue (broader) sample (potentially collected at a different site). The blue vertical lines mark the out-of-domain prediction error. This error appears larger than both errors in c and indicates a systematic underestimation of the outcome.
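
The three error types in panels c and d can be reproduced with a small simulation. The following is a minimal sketch, not the authors' code: the means and covariances of the red and blue populations, the seed, and the names `simulate_errors` and `mse` are all assumptions chosen for illustration, since the caption does not state the parameters used for the figure.

```python
import numpy as np

# Hypothetical population parameters (not taken from the figure).
# Column 0 is the feature, column 1 is the outcome.
RED_MEAN = np.array([0.0, 0.0])
RED_COV = np.array([[1.0, 0.6], [0.6, 1.0]])
BLUE_MEAN = np.array([1.0, 2.0])                # shifted population
BLUE_COV = np.array([[2.0, 1.2], [1.2, 2.0]])   # broader spread


def mse(y_true, y_pred):
    """Mean squared prediction error."""
    return float(np.mean((y_true - y_pred) ** 2))


def simulate_errors(n=30, seed=0):
    rng = np.random.default_rng(seed)

    # Panel b: one sample of n points from each population.
    red = rng.multivariate_normal(RED_MEAN, RED_COV, size=n)
    blue = rng.multivariate_normal(BLUE_MEAN, BLUE_COV, size=n)

    # Panel c: fit a linear model on half of the red sample.
    half = n // 2
    x_train, y_train = red[:half, 0], red[:half, 1]
    x_test, y_test = red[half:, 0], red[half:, 1]
    slope, intercept = np.polyfit(x_train, y_train, deg=1)

    def predict(x):
        return slope * x + intercept

    # In-sample error: draw NEW outcomes at the same training feature
    # values, using the conditional distribution of the red Gaussian.
    beta = RED_COV[0, 1] / RED_COV[0, 0]
    cond_mean = RED_MEAN[1] + beta * (x_train - RED_MEAN[0])
    cond_sd = np.sqrt(RED_COV[1, 1] - beta * RED_COV[0, 1])
    y_new = rng.normal(cond_mean, cond_sd)

    return {
        "in_sample": mse(y_new, predict(x_train)),
        "out_of_sample": mse(y_test, predict(x_test)),
        # Panel d: apply the red model to the blue sample.
        "out_of_domain": mse(blue[:, 1], predict(blue[:, 0])),
        # Positive values mean the model underestimates the blue outcomes.
        "blue_mean_residual": float(np.mean(blue[:, 1] - predict(blue[:, 0]))),
    }


if __name__ == "__main__":
    for name, value in simulate_errors().items():
        print(f"{name}: {value:.3f}")
```

With these assumed parameters the blue population lies above the red regression line, so the out-of-domain error tends to exceed the other two and the mean residual on the blue sample tends to be positive, mirroring the systematic underestimation described for panel d. Individual draws vary, so the ordering is expected on average rather than for every seed.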
