Figure 5

Overlap (OL) versus overreach (ORgt) scores in 2022 versus 2015 (with updated masks). The best results have high overlap (top) and low overreach (left). Top graphs: scores per bundle (averaged over all teams). Colors distinguish easy (blue), average (green) and hard-to-track (pink) bundles, as in ref. 4. Bottom graphs: scores per submission (averaged over all bundles). Colors reflect the algorithm choice: deterministic (blue), probabilistic (orange) or other (gray). Arrows highlight the displacement of some submissions whose scores changed strongly (generally, better OL but worse ORgt).