Extended Data Fig. 4: Analysis of the K562 cell line in the sci-Plex experiment. | Nature Methods

Extended Data Fig. 4: Analysis of the K562 cell line in the sci-Plex experiment.

From: Deep generative modeling of sample-level heterogeneity in single-cell genomics

MrVI was fit on the 92 drugs, each at four doses, that passed the DE-gene filter. The analysis parallels that of Figure 4.

a, b. MDEs of the u and z latent spaces, colored by the pathway of the drug used to treat each cell (left) and by the cell-cycle stage of each cell (right). In the MDEs colored by pathway, only the top 20 percent of samples, ranked by distance from the vehicle, are shown at full opacity.

c. PCA of sample distance matrices. Left: scatterplot of all local sample distance matrices projected onto the top two principal components and colored by cell-cycle stage; it shows no visible subclusters. Right: barplot of the proportion of variance explained against the number of principal components used.

d. Barplot comparing MrVI against the benchmark methods on a performance metric that measures alignment with prior knowledge. Each bar represents the metric for one model fit, except for ‘Random’, which reports the 95% confidence interval over 100 permutations of the distance matrix inferred by MrVI. The average percentile of distances measures how much closer samples with the same drug and different doses are to each other relative to the rest of the distances. We expect the average percentile to be low. No Connectivity Map data were available for the K562 cell line, so the silhouette metric could not be computed for this dataset.

e. Hierarchically clustered sample distance matrix. Rows are annotated by the pathway, dose, and cluster of each sample (clusters inferred from the distance matrix). Panels e and f cover only the top 20 percent of drug-dose combinations (74/368), ranked by distance from the vehicle.

f. Heatmap of Gene Set Enrichment Analysis (GSEA) scores for the Human MSigDB Hallmark gene set collection, computed on the DE genes identified for each cluster in panel e. The upper-right and bottom-left triangles of each tile represent the scores for the sets of upregulated and downregulated DE genes, respectively.
We observe a cluster of Bcr-Abl tyrosine kinase inhibitors (bosutinib, dasatinib, nilotinib) with a significant effect, which is absent from the other cell lines.
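
The average-percentile metric of panel d can be illustrated with a minimal sketch. This is one plausible reading of the caption, not the authors' implementation: it assumes a single symmetric sample-by-sample distance matrix, ranks every pairwise distance among all pairwise distances, and averages the resulting percentiles over pairs that share a drug but differ in dose. The function names, the toy data, and the rank-based quantiles used for the permutation interval are all hypothetical.

```python
from bisect import bisect_right
from itertools import combinations
import random


def average_same_drug_percentile(dist, drugs, doses):
    """Mean percentile (0-1) of same-drug, different-dose distances
    among all pairwise sample distances. Lower means closer."""
    n = len(drugs)
    pairs = list(combinations(range(n), 2))
    pair_dist = [dist[i][j] for i, j in pairs]
    ordered = sorted(pair_dist)
    selected = [
        # Fraction of all pairwise distances that are <= this distance.
        bisect_right(ordered, d) / len(ordered)
        for (i, j), d in zip(pairs, pair_dist)
        if drugs[i] == drugs[j] and doses[i] != doses[j]
    ]
    if not selected:
        raise ValueError("no same-drug, different-dose pairs found")
    return sum(selected) / len(selected)


def permuted_interval(dist, drugs, doses, n_perm=100, seed=0):
    """Approximate 95% interval of the metric when the rows and columns
    of the distance matrix are jointly permuted (assumed 'Random' baseline)."""
    rng = random.Random(seed)
    n = len(drugs)
    scores = []
    for _ in range(n_perm):
        perm = list(range(n))
        rng.shuffle(perm)
        shuffled = [[dist[a][b] for b in perm] for a in perm]
        scores.append(average_same_drug_percentile(shuffled, drugs, doses))
    scores.sort()
    # Rank-based approximation of the 2.5% and 97.5% quantiles.
    return scores[int(0.025 * (n_perm - 1))], scores[int(0.975 * (n_perm - 1))]


# Toy data (hypothetical): two drugs, two doses each.
toy_drugs = ["A", "A", "B", "B"]
toy_doses = [1, 2, 1, 2]
toy_dist = [
    [0, 1, 5, 6],
    [1, 0, 7, 8],
    [5, 7, 0, 2],
    [6, 8, 2, 0],
]
score = average_same_drug_percentile(toy_dist, toy_drugs, toy_doses)
low, high = permuted_interval(toy_dist, toy_drugs, toy_doses)
```

In the toy matrix the two same-drug pairs hold the two smallest of the six pairwise distances, so the score falls well below 0.5, which is the behavior the caption describes as expected for a model that aligns with prior knowledge.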
