Fig. 2: PrePR-CT accurately predicts the effect of a single perturbation in an unseen cell type.

a, UMAP visualization of cell types stimulated with IFNβ (n = 24,249 cells), as described in Kang et al. b, R² values comparing the predicted versus observed mean (left) and standard deviation (right) of gene expression across seven cell types. Bars show the mean R² over n = 100 bootstrap iterations; error bars show ± standard deviation, and black dots indicate individual bootstrap iterations. Results are shown for all genes and for the top 100 DEGs of the held-out test cell type. c, UMAP visualization comparing predicted (pred) and observed samples of stimulated and control B cells (n = 3,789 cells). d, Comparison of R² scores for the predicted mean and standard deviation of gene expression across CPA, scGen and PrePR-CT under the leave-one-cell-type-out setting (IFNβ perturbation). For each metric and gene set, R² values are reported across the held-out cell types (n = 7). Box plots show the median (centre line) and interquartile range (25th–75th percentiles; box); whiskers extend to 1.5× the interquartile range. e, Scatter plots of predicted versus observed differential expression (mean stimulated − mean control) in B cells across gene subsets: all genes in the B cell graph (n = 3,176), highly expressed genes (n = 1,588), lowly expressed genes (n = 1,588), upregulated DEGs (n = 43) and downregulated DEGs (n = 43). Each point represents one gene. Red lines show linear regression fits and shaded areas indicate the 95% confidence intervals. R² values are shown in each panel.
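
The bootstrap metric in panel b and the differential expression in panel e can be illustrated with a minimal sketch. It is not the authors' code and rests on several assumptions: R² is taken as the squared Pearson correlation between per-gene statistics, each iteration resamples cells with replacement, and the names `r2`, `bootstrap_r2`, `differential_expression` and the `frac` parameter are hypothetical.

```python
import numpy as np


def r2(observed, predicted):
    """Squared Pearson correlation between two 1-D arrays (assumed R² definition)."""
    r = np.corrcoef(observed, predicted)[0, 1]
    return r ** 2


def bootstrap_r2(pred, obs, n_iter=100, frac=0.8, seed=0):
    """Bootstrap R² of per-gene mean and standard deviation.

    pred, obs: (cells x genes) expression arrays for one held-out cell type.
    Returns two arrays of length n_iter: R² for the mean and R² for the std.
    The resampling scheme (frac of cells, with replacement) is an assumption.
    """
    rng = np.random.default_rng(seed)
    n_pred = int(frac * pred.shape[0])
    n_obs = int(frac * obs.shape[0])
    r2_mean, r2_std = [], []
    for _ in range(n_iter):
        # Resample predicted and observed cells independently.
        p = pred[rng.choice(pred.shape[0], n_pred, replace=True)]
        o = obs[rng.choice(obs.shape[0], n_obs, replace=True)]
        r2_mean.append(r2(o.mean(axis=0), p.mean(axis=0)))
        r2_std.append(r2(o.std(axis=0), p.std(axis=0)))
    return np.array(r2_mean), np.array(r2_std)


def differential_expression(stimulated, control):
    """Per-gene mean stimulated minus mean control, as plotted in panel e."""
    return stimulated.mean(axis=0) - control.mean(axis=0)
```

Under these assumptions, the bar height in panel b would correspond to the mean of each returned array and the error bar to its standard deviation. Restricting the columns to a gene subset (for example the top 100 DEGs) before calling `bootstrap_r2` gives the per-subset values.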