Extended Data Fig. 3: Proximity bias quantification in DepMap data.

a, Split genome-wide heatmap of the DepMap 22Q4 (above diagonal) and 23Q2 (below diagonal) CRISPR data. Both are processed with the Chronos pipeline17, but 23Q2 has an additional correction applied to reduce proximity bias. 22Q4 has 1,078 cell lines; 23Q2 has 1,095 cell lines. b, Distributions of arm-level Brunner-Munzel probabilities for maps built using pairs of autosomal chromosome arms (741 pairs represented twice in blue, green and red distributions). The blue distribution is built using all DepMap 22Q4 CRISPR-Cas9 cell lines; orange uses random samples of cell lines matched in number to the green distribution (10 random sampling runs); green uses only cell lines in which fewer than 1% of genes have copy number calls outside of [1.75, 2.25] (counts in Supplementary Table 4); and red uses all cell lines in the DepMap shRNA data. Two-sided Mann-Whitney U tests are highly significant (p-value < 1e-10) for all pairwise comparisons between distributions. c, Boxen plots showing distributions of the ratio of within-chromosome-arm relationships to between-arm relationships for each chromosome arm across different gene annotation sets (n = 39 chromosome arms for all sources). Boxes are drawn at each octile, with outliers shown beyond those boxes. The 19Q3 DepMap data show a much higher ratio of within-arm to between-arm annotations, suggesting a systematic bias in the predicted associations. d, Counts of gene-gene relationships within and between chromosome arms for shinyDepMap 19Q3 data (blue and tan, n = 4,747 and 9,271 respectively) and public annotation sets (Reactome, HuMap, and CORUM) (green and red, n = 98 and 2,825 respectively)25,26,27. DepMap predicts a much higher proportion of within-chromosome-arm relationships than are found in public annotation sets (odds ratio 0.068, Fisher exact p-value < 1e-10).
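
As a rough check on the panel d statistics, the sketch below recomputes the odds ratio and a two-sided Fisher exact p-value from the four counts given in the legend. It assumes that the first count in each pair is the within-arm count and the second the between-arm count, and that the odds ratio is taken as public annotations relative to DepMap; this reading reproduces the reported value of 0.068, but it is an inference from the legend, not a statement of the authors' code. The function names `log_comb` and `fisher_exact_two_sided` are hypothetical helpers written in plain Python for illustration. The p-value is returned as a base-10 logarithm because it is too small to represent as an ordinary float.

```python
from math import exp, lgamma, log

# Counts taken from the legend of panel d.
# Assumption: within-arm first, between-arm second in each pair.
PUBLIC_WITHIN, PUBLIC_BETWEEN = 98, 2825
DEPMAP_WITHIN, DEPMAP_BETWEEN = 4747, 9271


def log_comb(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def fisher_exact_two_sided(a, b, c, d):
    """Sample odds ratio and log10 two-sided Fisher exact p-value
    for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def log_pmf(x):
        # Hypergeometric log-probability of x in the top-left cell,
        # with all table margins held fixed.
        return (log_comb(row1, x) + log_comb(row2, col1 - x)
                - log_comb(total, col1))

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    log_observed = log_pmf(a)

    # Two-sided p-value: sum over all tables that are no more
    # probable than the observed one (small tolerance for rounding).
    tail = [lp for lp in (log_pmf(x) for x in range(lowest, highest + 1))
            if lp <= log_observed + 1e-7]
    largest = max(tail)
    log_p = largest + log(sum(exp(lp - largest) for lp in tail))

    odds_ratio = (a * d) / (b * c)
    return odds_ratio, log_p / log(10)


odds_ratio, log10_p = fisher_exact_two_sided(
    PUBLIC_WITHIN, PUBLIC_BETWEEN, DEPMAP_WITHIN, DEPMAP_BETWEEN
)
print(f"odds ratio = {odds_ratio:.3f}")
print(f"log10(p-value) = {log10_p:.1f}")
```

Swapping the two rows of the table gives the reciprocal odds ratio (DepMap relative to public annotations) and leaves the p-value unchanged.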