Extended Data Fig. 6: Potential technical confounders of cell cycle estimates.

a. Scatter plots showing cell cycle proportion and phase bias from bespoke (y axis) and consensus (x axis) G1/S and G2/M gene signatures. Each point corresponds to one cell type in one dataset. Red lines denote y = x, and r denotes Pearson correlation. b. Bar plot showing average percentage of cycling cells in 10x datasets (y axis) in each cell type (x axis; n = 53, 51, 33, 9, 58, 26, 17, 42, 29, 44, 19, 45, in order from left to right). Error bars denote standard error. c. Heatmap showing Spearman correlation between cell types (colour) of percentages of cycling cells in 10x datasets. Significant correlations are labelled with p values (2.0 × 10^−2, 8.1 × 10^−5, 2.2 × 10^−2, 5.6 × 10^−3, 2.0 × 10^−2, 2.0 × 10^−2, 3.9 × 10^−6, 5.1 × 10^−4, 1.6 × 10^−3, 5.9 × 10^−3, 6.1 × 10^−6, 5.9 × 10^−6, 1.3 × 10^−4, 1.1 × 10^−4, 7.1 × 10^−3, 8.6 × 10^−3, 1.2 × 10^−3, 5.9 × 10^−6, 4.8 × 10^−2, 1.5 × 10^−2, 4.2 × 10^−2, 9.3 × 10^−4, 2.0 × 10^−2, 1.1 × 10^−23, 5.9 × 10^−3, in order from left to right and top to bottom), which were computed by a two-tailed test of zero correlation using algorithm AS 89 (ref. 53), and adjusted to FDR < 0.05 (all p values are provided in the Source Data). d. Bar plot showing percentage of cycling malignant cells (y axis) in each 10x dataset (bars), grouped by cancer type (x axis). Crosses denote the average for each cancer type, weighted by number of samples containing at least 10 malignant cells. Bar colour categorises studies by number of such samples, and values above the plot denote the total number of such samples per cancer type. e. Bar plot showing phase bias (y axis) of malignant cells in each 10x dataset (bars), grouped by cancer type (x axis). Crosses, bar colour and number of samples per cancer type are as in d. f. Scatter plots showing, for each cell type and sequencing platform, percentage of cycling cells (y axis) against number of detected genes (x axis) in each dataset (points). Regression lines and Pearson correlations were computed with and without outliers (red and blue, respectively, or purple in cases with no outliers). Average correlations and p values (computed by two-sided t test, without adjustment) are shown at the top. g–i. Scatter plots as in a showing: percentage of cycling cells against number of captured cells; phase bias against number of detected genes; phase bias against number of captured cells.
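
The FDR adjustment mentioned for panel c can be illustrated with a minimal sketch. The caption does not say which adjustment procedure was used; the code below assumes the Benjamini-Hochberg procedure, and the function name `bh_adjust` and the example p values are hypothetical, not taken from the figure or its Source Data.

```python
def bh_adjust(pvalues):
    """Return Benjamini-Hochberg adjusted p values, in the input order.

    Assumption: the caption only states "adjusted to FDR < 0.05";
    Benjamini-Hochberg is a common choice, not a confirmed detail.
    """
    m = len(pvalues)
    # Indices of the p values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, so that each adjusted value
    # never exceeds the adjusted value of any larger raw p value.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Illustrative (made-up) raw p values for four cell type pairs.
raw = [0.01, 0.04, 0.03, 0.005]
adjusted = bh_adjust(raw)
# A pair is called significant when its adjusted p value is below 0.05.
significant = [q < 0.05 for q in adjusted]
```

Under this assumption, only the cell type pairs whose adjusted value falls below 0.05 would be labelled in the heatmap.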