Fig. 7: An intuitive demonstration of Morgan’s Canon of data in cat-dog classification.
From: Mitigating data bias and ensuring reliable evaluation of AI models with shortcut hull learning

a The boxes represent three probability spaces \((\Omega_1, \mathcal{F}_1, \mathbb{P}_1)\), \((\Omega_1, \mathcal{F}_1, \mathbb{P}_2)\), and \((\Omega_2, \mathcal{F}_2, \mathbb{P}_3)\), where \(\Omega_2 \in \mathcal{F}_1\) and \(\mathcal{F}_2 = \mathcal{F}_1|_{\Omega_2} = \{E \cap \Omega_2 \mid E \in \mathcal{F}_1\}\). The corresponding ID sample sets within each space are \(\mathcal{E}_{\mathrm{ID1}}\), \(\mathcal{E}_{\mathrm{ID2}}\), and \(\mathcal{E}_{\mathrm{ID3}}\), with \(\mathcal{E}_{\mathrm{ID2}} = \Omega_1\) and \(\mathcal{E}_{\mathrm{ID1}} = \mathcal{E}_{\mathrm{ID3}} = \Omega_2\). In the measurable space \((\Omega_1, \mathcal{F}_1)\), the event representing dogs is defined as \(S_{\mathrm{dog}} \cup T_{\mathrm{dog}}\), where \(S_{\mathrm{dog}}\) and \(T_{\mathrm{dog}}\) denote the shape and texture events for dogs, respectively. Similarly, the event representing cats is defined as \(S_{\mathrm{cat}} \cup T_{\mathrm{cat}}\), where \(S_{\mathrm{cat}}\) and \(T_{\mathrm{cat}}\) denote the shape and texture events for cats. The pair \(\{S_{\mathrm{dog}} \cup T_{\mathrm{dog}}, S_{\mathrm{cat}} \cup T_{\mathrm{cat}}\}\) constitutes a partition of the sample space \(\Omega_1\). In the other measurable space \((\Omega_2, \mathcal{F}_2)\), the event \(S_{\mathrm{dog}} \cap T_{\mathrm{dog}}\), which represents dogs, comprises the same samples as it does in \(\Omega_1\). Likewise, the event \(S_{\mathrm{cat}} \cap T_{\mathrm{cat}}\) represents cats. The pair \(\{S_{\mathrm{dog}} \cap T_{\mathrm{dog}}, S_{\mathrm{cat}} \cap T_{\mathrm{cat}}\}\) constitutes a partition of the sample space \(\Omega_2\).
b The boxes denote the random variables \((X_1, Y_1)\), \((X_2, Y_2)\), and \((X_3, Y_3)\) defined over these three probability spaces, respectively, where \(\forall \omega \in \Omega_1\), \(X_1(\omega) = X_2(\omega)\), and \(\forall \omega \in \Omega_2\), \(X_1(\omega) = X_2(\omega) = X_3(\omega)\) and \(Y_1(\omega) = Y_2(\omega) = Y_3(\omega)\).
c An intuitive illustration of the possible partitions \(\mathcal{Y}_1 = \{\sigma(Y') \mid Y' \overset{\text{a.s.}}{=} Y_1, \sigma(Y') \subseteq \sigma(X_1)\}\), \(\mathcal{Y}_2 = \{\sigma(Y') \mid Y' \overset{\text{a.s.}}{=} Y_2, \sigma(Y') \subseteq \sigma(X_2)\}\), and \(\mathcal{Y}_3 = \{\sigma(Y') \mid Y' \overset{\text{a.s.}}{=} Y_3, \sigma(Y') \subseteq \sigma(X_3)\}\) induced by \(Y_1\), \(Y_2\), and \(Y_3\), respectively, where \(|\mathcal{Y}_1| > 1\) and \(|\mathcal{Y}_2| = |\mathcal{Y}_3| = 1\). \(\mathcal{Y}_1\) and \(\mathcal{Y}_2\) illustrate the impact of \(\mathcal{E}_{\mathrm{ID}}\) on \(\mathcal{Y}\) under a fixed \((\Omega, \mathcal{F})\), while \(\mathcal{Y}_1\) and \(\mathcal{Y}_3\) illustrate the effect of \(\Omega\) on \(\mathcal{Y}\) when \(\mathcal{E}_{\mathrm{ID}}\) remains constant.
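
The cardinalities in panel c can be checked on a small finite model. The sketch below is an illustration only and is not taken from the paper: it assumes a hypothetical six-outcome sample space in which every sample shows a dog or cat cue through shape only, texture only, or both; it assumes labels are restricted to the two classes; and it assumes X is injective, so that every candidate σ(Y') is automatically contained in σ(X). Under these assumptions, a label Y' that equals Y almost surely may take any value outside the ID sample set, and each distinct labelling generates a distinct partition.

```python
from itertools import product

LABELS = ("dog", "cat")
CUES = ("shape", "texture", "both")  # hypothetical cue types, assumed for illustration

# Assumed finite stand-in for Omega_1: outcomes are (class, cue) pairs.
OMEGA_1 = [(label, cue) for label in LABELS for cue in CUES]
# Assumed stand-in for Omega_2: samples where shape and texture agree.
OMEGA_2 = [w for w in OMEGA_1 if w[1] == "both"]


def count_label_partitions(omega, id_set):
    """Count distinct partitions sigma(Y') with Y' equal to the true label on id_set.

    Outcomes outside id_set have probability zero, so Y' is unconstrained there.
    X is assumed injective, so sigma(Y') is always a subset of sigma(X).
    """
    id_set = set(id_set)
    free = [w for w in omega if w not in id_set]
    partitions = set()
    for values in product(LABELS, repeat=len(free)):
        y = {w: w[0] for w in id_set}   # true label on the ID set
        y.update(zip(free, values))     # arbitrary label on the null set
        blocks = frozenset(
            frozenset(w for w in omega if y[w] == label) for label in LABELS
        )
        partitions.add(blocks)
    return len(partitions)


n1 = count_label_partitions(OMEGA_1, OMEGA_2)  # space Omega_1, ID set Omega_2
n2 = count_label_partitions(OMEGA_1, OMEGA_1)  # space Omega_1, ID set Omega_1
n3 = count_label_partitions(OMEGA_2, OMEGA_2)  # space Omega_2, ID set Omega_2
print(n1, n2, n3)  # 16 1 1
```

In this toy model the first count exceeds one because four outcomes lie outside the ID set, while enlarging the ID set to the whole space or shrinking the space to the ID set both leave a single admissible partition, matching the pattern \(|\mathcal{Y}_1| > 1\), \(|\mathcal{Y}_2| = |\mathcal{Y}_3| = 1\) stated in the caption.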