Fig. 7: An intuitive demonstration of Morgan’s Canon of data in cat-dog classification. | Nature Communications

From: Mitigating data bias and ensuring reliable evaluation of AI models with shortcut hull learning

a The boxes represent three probability spaces \((\Omega_1, \mathcal{F}_1, \mathbb{P}_1)\), \((\Omega_1, \mathcal{F}_1, \mathbb{P}_2)\), and \((\Omega_2, \mathcal{F}_2, \mathbb{P}_3)\), where \(\Omega_2 \in \mathcal{F}_1\) and \(\mathcal{F}_2 = \mathcal{F}_1|_{\Omega_2} = \{E \cap \Omega_2 \mid E \in \mathcal{F}_1\}\). The corresponding ID sample sets within each space are \(\mathcal{E}_{\mathrm{ID1}}\), \(\mathcal{E}_{\mathrm{ID2}}\), and \(\mathcal{E}_{\mathrm{ID3}}\), with \(\mathcal{E}_{\mathrm{ID2}} = \Omega_1\) and \(\mathcal{E}_{\mathrm{ID1}} = \mathcal{E}_{\mathrm{ID3}} = \Omega_2\). In the measurable space \((\Omega_1, \mathcal{F}_1)\), the event representing dogs is defined as \(S_{\mathrm{dog}} \cup T_{\mathrm{dog}}\), where \(S_{\mathrm{dog}}\) and \(T_{\mathrm{dog}}\) denote the shape and texture events for dogs, respectively. Similarly, the event representing cats is defined as \(S_{\mathrm{cat}} \cup T_{\mathrm{cat}}\), where \(S_{\mathrm{cat}}\) and \(T_{\mathrm{cat}}\) denote the shape and texture events for cats. \(\{S_{\mathrm{dog}} \cup T_{\mathrm{dog}},\ S_{\mathrm{cat}} \cup T_{\mathrm{cat}}\}\) constitutes a partition of the sample space \(\Omega_1\). In the other measurable space \((\Omega_2, \mathcal{F}_2)\), the event \(S_{\mathrm{dog}} \cap T_{\mathrm{dog}}\) representing dogs comprises the same samples as it does in \(\Omega_1\). Likewise, the event \(S_{\mathrm{cat}} \cap T_{\mathrm{cat}}\) represents cats. \(\{S_{\mathrm{dog}} \cap T_{\mathrm{dog}},\ S_{\mathrm{cat}} \cap T_{\mathrm{cat}}\}\) constitutes a partition of the sample space \(\Omega_2\).
b The boxes denote the random variables \((X_1, Y_1)\), \((X_2, Y_2)\), and \((X_3, Y_3)\) defined respectively over these three probability spaces, where \(\forall \omega \in \Omega_1,\ X_1(\omega) = X_2(\omega)\), and \(\forall \omega \in \Omega_2,\ X_1(\omega) = X_2(\omega) = X_3(\omega),\ Y_1(\omega) = Y_2(\omega) = Y_3(\omega)\).
c An intuitive illustration of the possible partitionings \(\mathcal{Y}_1 = \{\sigma(Y') \mid Y' \overset{\mathrm{a.s.}}{=} Y_1,\ \sigma(Y') \subseteq \sigma(X_1)\}\), \(\mathcal{Y}_2 = \{\sigma(Y') \mid Y' \overset{\mathrm{a.s.}}{=} Y_2,\ \sigma(Y') \subseteq \sigma(X_2)\}\), and \(\mathcal{Y}_3 = \{\sigma(Y') \mid Y' \overset{\mathrm{a.s.}}{=} Y_3,\ \sigma(Y') \subseteq \sigma(X_3)\}\) induced by \(Y_1\), \(Y_2\), and \(Y_3\), respectively, where \(|\mathcal{Y}_1| > 1\) and \(|\mathcal{Y}_2| = |\mathcal{Y}_3| = 1\). \(\mathcal{Y}_1\) and \(\mathcal{Y}_2\) illustrate the impact of \(\mathcal{E}_{\mathrm{ID}}\) on \(\mathcal{Y}\) under a fixed \((\Omega, \mathcal{F})\), while \(\mathcal{Y}_1\) and \(\mathcal{Y}_3\) illustrate the effect of \(\Omega\) on \(\mathcal{Y}\) when \(\mathcal{E}_{\mathrm{ID}}\) remains constant.
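
The cardinalities in panel c can be checked on a toy finite model. The sketch below is not the authors' code and rests on assumptions that are not in the figure: a hypothetical six-outcome sample space standing in for \(\Omega_1\), an identity-like \(X\) (so every subset is \(\sigma(X)\)-measurable), only binary labelings \(Y'\), and "almost surely equal" read as "equal on every outcome of the ID set", which is assumed to carry all the probability mass. The helper name `count_partitions` is likewise made up for illustration.

```python
from itertools import product

# Hypothetical finite sample space: each outcome is (class, cue), where cue says
# whether the sample shows the class's shape only, its texture only, or both.
OMEGA_1 = [(c, cue) for c in ("dog", "cat")
           for cue in ("both", "shape_only", "texture_only")]
# Omega_2 keeps only outcomes in S ∩ T, i.e. shape and texture both present.
OMEGA_2 = [w for w in OMEGA_1 if w[1] == "both"]


def true_label(w):
    """Ground-truth label Y of an outcome (assumed to be its class)."""
    return w[0]


def count_partitions(omega, id_set):
    """Count distinct partitions {Y' = dog}, {Y' = cat} of `omega` induced by
    binary labelings Y' that agree with the true label on every outcome in
    `id_set` (the outcomes assumed to carry positive probability)."""
    partitions = set()
    for labels in product(("dog", "cat"), repeat=len(omega)):
        y_prime = dict(zip(omega, labels))
        if all(y_prime[w] == true_label(w) for w in id_set):
            dog_event = frozenset(w for w in omega if y_prime[w] == "dog")
            cat_event = frozenset(omega) - dog_event
            partitions.add(frozenset({dog_event, cat_event}))
    return len(partitions)


n1 = count_partitions(OMEGA_1, OMEGA_2)  # large space, ID set = Omega_2
n2 = count_partitions(OMEGA_1, OMEGA_1)  # large space, ID set = Omega_1
n3 = count_partitions(OMEGA_2, OMEGA_2)  # restricted space, ID set = Omega_2
print(n1, n2, n3)  # prints: 16 1 1
```

In this toy setting the first count exceeds one because the four outcomes outside the ID set can be labeled freely, while enlarging the ID set to the whole space, or shrinking the space to the ID set, leaves a single admissible partition.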