Extended Data Fig. 6: Quantification of B-ALL developmental states and multipotency score.

a) Development of a 99-gene regression model to quantify PC1, hereafter named the ‘multipotency score’. Correlation results for 10-fold cross validation with 10 repeats are shown alongside correlation for the final model trained on all 2,046 samples. r and P values from Pearson correlation are shown. b) B-ALL multipotency score validation on donor-level pseudobulk profiles from a normal B cell development atlas. Single-cell transcriptomes spanning n = 90 donors were pooled together based on tissue source, study, and sequencing technology into pseudobulk profiles for visualization of multipotency score enrichment. For the specific number of pseudobulk profiles as well as the number of cells for each cell state along B cell development, refer to Supplementary Table 11. Box plots indicate the range of the central 50% of the data, with the central line marking the median. Whiskers extend from each box to 1.5× the interquartile range. c) Ridge plot comparing the inferred abundance of HSC/MPP and pre-pDC states by genomic subtype across 2,046 B-ALL patient samples. d,e) Association plots between multipotency score or developmental state abundance and driver alterations (d) or gene fusions (e). The magnitude of each association, quantified as –log10(P value), is depicted through the size and color intensity of each dot. The direction of the association is depicted through the color, wherein higher abundance is green and lower abundance is purple. Only associations with an FDR-corrected P value < 0.05 are depicted. For driver alterations (d), abundances of samples with each alteration were compared to all other samples, and genomic subtype was adjusted for as a covariate. For gene fusions (e), abundances of samples with each fusion were compared to abundances from samples with no gene fusions.
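The dot-selection rule described for panels d and e can be illustrated with a minimal sketch. It assumes the Benjamini-Hochberg procedure for the FDR correction and uses the unadjusted P value for the –log10 magnitude; neither detail is stated in the legend. The function names, the tuple layout, and the example values are hypothetical.

```python
import math


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order.

    Assumption: the legend does not name the FDR method; BH is used here
    only as a common default.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def dots_to_plot(associations, alpha=0.05):
    """Select and style the associations that would be drawn as dots.

    associations: list of (label, p_value, effect) tuples (hypothetical
    layout), where effect > 0 means higher abundance.
    """
    q_values = benjamini_hochberg([p for _, p, _ in associations])
    dots = []
    for (label, p, effect), q in zip(associations, q_values):
        if q < alpha:  # only FDR-significant associations are depicted
            dots.append({
                "label": label,
                # Assumption: magnitude uses the raw P value.
                "magnitude": -math.log10(p),
                "color": "green" if effect > 0 else "purple",
            })
    return dots


# Made-up example values, for illustration only.
example = [
    ("alteration_A", 0.001, 0.8),
    ("alteration_B", 0.04, -0.3),
    ("alteration_C", 0.5, 0.1),
]
adjusted = benjamini_hochberg([p for _, p, _ in example])
dots = dots_to_plot(example)
```

In this toy input, alteration_B has a nominal P value below 0.05 but an adjusted value above it, so it would not be drawn.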
f-h) Differences in developmental state abundance (myeloid progenitors and pre-pDCs) between transcriptional subtypes of DUX4-R (total n = 112; DUX4-R early/multipotent = 70; DUX4-R committed = 42) (f), KMT2A-R (total n = 144; KMT2A-R early/multipotent = 125; KMT2A-R committed = 19) (g) and BCR::ABL1 (total n = 127; BCR::ABL1 early/multipotent = 32; BCR::ABL1 Inter = 26; BCR::ABL1 committed = 69) (h). i) Multipotency score and developmental state abundance explain differences between Early-Pro (early/multipotent, n = 24), Inter-Pro (n = 8), and Late-Pro (committed, n = 22) transcriptional subgroups of BCR::ABL1 (n = 54) from Kim et al.14. For each comparison, P values from a two-tailed Wilcoxon rank-sum test are reported. Box plots indicate the range of the central 50% of the data, with the central line marking the median. Whiskers extend from each box to 1.5× the interquartile range.
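The box plot convention used in these panels can be sketched as follows. This is a minimal illustration that assumes the common Tukey-style reading, in which each whisker ends at the most extreme data point lying within 1.5× the interquartile range of the box, and it assumes the inclusive quartile method of Python's `statistics` module. The plotting software and quartile definition actually used are not stated in the legend, and the function name and input values are hypothetical.

```python
import statistics


def box_plot_stats(values):
    """Compute box, median, whisker ends, and outliers for one group."""
    # Quartiles: the box spans q1..q3, the central 50% of the data.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Whiskers stop at the most extreme observations inside the fences.
    inside = [v for v in values if lower_fence <= v <= upper_fence]
    outliers = sorted(v for v in values if v < lower_fence or v > upper_fence)
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": outliers,
    }


# Made-up abundance values, for illustration only.
stats = box_plot_stats([1, 2, 3, 4, 5, 6, 7, 8, 9, 30])
```

Here the value 30 lies beyond the upper fence, so the upper whisker ends at 9 and 30 would be drawn as a separate point.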