Extended Data Table 3: Distribution shift quantification in the OOD settings between different datasets

From: The limits of fair medical imaging AI in real-world generalization

  1. a, Label shift P(Y) was quantified as the total variation distance between the distributions of Y in the ID and OOD datasets. P values were computed using a two-sided proportion z-test. b, Covariate shift P(X) was quantified by first encoding the inputs into representations with a frozen foundation model f (that is, MedCLIP; ref. 79) and then computing the maximum mean discrepancy (MMD) with a Gaussian kernel (ref. 81) between the ID and OOD datasets. P values were computed using a two-sided permutation test with the MMD as the test statistic (ref. 81). c, Prevalence shift P(Y|A = a) was quantified as the total variation distance between the ID and OOD datasets, conditioned on each subgroup a. P values were computed using a two-sided proportion z-test. d, Representation shift P(X|A = a) was quantified as the MMD between the ID and OOD datasets, computed within each subgroup a (ref. 81). P values were computed using a two-sided permutation test with the MMD as the test statistic (ref. 81). All P values were adjusted for multiple testing using Bonferroni correction (ref. 78).
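
The quantities described in the legend can be illustrated with a minimal Python sketch. This is not the authors' code. It assumes a binary label Y (so the total variation distance reduces to the absolute difference in prevalence), a pooled standard error for the z-test, a biased estimate of the squared MMD, a fixed kernel bandwidth `sigma`, and a permutation P value that counts permuted statistics at least as large as the observed one. All function names and default values are hypothetical, and the representations are assumed to have been extracted beforehand as plain lists of floats.

```python
import math
import random


def tv_distance_binary(p_id, p_ood):
    """Total variation distance between two Bernoulli label distributions.

    Assumption: Y is binary, so TV = 0.5 * (|p - q| + |(1 - p) - (1 - q)|) = |p - q|.
    """
    return abs(p_id - p_ood)


def two_proportion_z_test(pos_id, n_id, pos_ood, n_ood):
    """Two-sided two-proportion z-test using a pooled standard error."""
    p1, p2 = pos_id / n_id, pos_ood / n_ood
    pooled = (pos_id + pos_ood) / (n_id + n_ood)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_id + 1 / n_ood))
    if se == 0:  # all labels identical in both datasets: no evidence of shift
        return 0.0, 1.0
    z = (p1 - p2) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided standard normal tail
    return z, p_value


def gaussian_kernel(x, y, sigma):
    """Gaussian (RBF) kernel between two equal-length feature vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2 * sigma ** 2))


def mmd2(xs, ys, sigma=1.0):
    """Biased (V-statistic) estimate of the squared MMD between two samples."""
    def mean_kernel(a, b):
        total = sum(gaussian_kernel(u, v, sigma) for u in a for v in b)
        return total / (len(a) * len(b))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2 * mean_kernel(xs, ys)


def mmd_permutation_test(xs, ys, sigma=1.0, n_perm=200, seed=0):
    """Permutation test with the squared MMD as the test statistic.

    Dataset membership is shuffled n_perm times; the P value is the share of
    permuted statistics at least as large as the observed one (with +1 smoothing).
    """
    rng = random.Random(seed)
    observed = mmd2(xs, ys, sigma)
    pooled = list(xs) + list(ys)
    n_x = len(xs)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if mmd2(pooled[:n_x], pooled[n_x:], sigma) >= observed:
            count += 1
    return observed, (count + 1) / (n_perm + 1)


def bonferroni(p_values):
    """Bonferroni adjustment: multiply each P value by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]
```

For the subgroup-conditioned quantities in c and d, the same functions would be applied after restricting both datasets to the samples with A = a. The pure-Python MMD above is quadratic in the sample size and is meant only to show the computation; a vectorised implementation would be needed for realistic dataset sizes.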