Fig. 4: Learning new GPs from query data.
From: Biologically informed deep learning to query gene programs in single-cell atlases

a, Distribution of single-cell latent representation values for the newly learned GPs across query cell types, shown for IFN-β-treated and control query cells. b,c, Overlap of the most influential genes dominating the variance in newly learned constrained B-cell nodes (b) and unconstrained nodes (c) with genes in existing related GPs and with top genes obtained from differential testing. The terms ‘MYELOIDS_DEG’ and ‘B_CELLS_DEG’ refer to genes obtained from one-versus-all Wilcoxon rank-sum tests in the query control cells for the myeloid and B-cell populations, respectively. The myeloid population consists of CD14+ monocytes, CD16+ monocytes and DCs. ‘INF_VS_CTRL_DEG’ denotes differentially expressed genes comparing IFN-β-treated and control cells. The existing GPs shown in c are those with maximal overlap with the newly learned GPs, sharing at least 12 genes. d–f, Visualization of newly learned GPs (shown for reference and query cells of cell types present in the query dataset) that discriminate specific cell types and states from the rest, such as B cells and myeloids with the effect of IFN removed (d) or B cells with the effect of IFN preserved (e,f). g–i, UMAP of expiMap’s latent space for the query dataset coloured by node 3 latent representation values (g), TMSB4X gene expression counts (h) and cell types (i). The dotted circle highlights DCs.
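
As a rough illustration of the comparison in b,c, the sketch below ranks genes with a one-versus-all Wilcoxon rank-sum test and intersects the top genes with a gene program. It is not the authors' pipeline: the expression matrix is synthetic, and the gene names (`g0` to `g5`), the helper functions and the `existing_gp` list are hypothetical placeholders chosen for this example.

```python
import numpy as np
from scipy.stats import ranksums


def one_vs_all_top_genes(expr, labels, gene_names, group, n_top=2):
    """Return the n_top genes with the largest rank-sum statistic for
    cells in `group` versus all remaining cells (one-versus-all)."""
    in_group = np.asarray(labels) == group
    stats = np.array([
        ranksums(expr[in_group, j], expr[~in_group, j]).statistic
        for j in range(expr.shape[1])
    ])
    order = np.argsort(-stats)[:n_top]  # largest statistics first
    return [gene_names[j] for j in order]


def overlap(genes_a, genes_b):
    """Sorted list of genes shared by two gene lists."""
    return sorted(set(genes_a) & set(genes_b))


# Synthetic toy data (assumption): 20 "B" cells and 40 "myeloid" cells, 6 genes.
rng = np.random.default_rng(0)
gene_names = ["g0", "g1", "g2", "g3", "g4", "g5"]
labels = ["B"] * 20 + ["myeloid"] * 40
expr = rng.poisson(1.0, size=(60, 6)).astype(float)
expr[:20, 0] += 10  # make g0 and g1 strongly elevated in the "B" cells
expr[:20, 1] += 10

b_cells_deg = one_vs_all_top_genes(expr, labels, gene_names, "B")
existing_gp = ["g1", "g2", "g5"]  # hypothetical existing gene program
shared = overlap(b_cells_deg, existing_gp)
print(b_cells_deg, shared)
```

On real data the same idea would be applied per population to the query control cells, and the resulting gene lists compared with the genes that dominate each newly learned node.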