Fig. 5: Interactive CellWhisperer-based and conventional bioinformatics analysis of a scRNA-seq dataset. | Nature Biotechnology


From: Multimodal learning enables chat-based exploration of single-cell data


a, Import and exploration of user-provided scRNA-seq data in CellWhisperer. b, UMAP of the CellWhisperer transcriptome embeddings for the imported Colonic Epithelium dataset (ref. 41), comprising scRNA-seq profiles of inflamed and noninflamed tissue biopsies of individuals with inflammatory bowel disease and healthy individuals. Cluster labels were generated by CellWhisperer and clusters were repositioned for compact visualization (interactive version: https://cellwhisperer.bocklab.org/colonic_epithelium). c, Zoomed-in view of the cluster labeled ‘Cycling ileal epithelial precursor cells’, colored by CellWhisperer scores for the free-text query: ‘Show me stem cells’. d, CellWhisperer chat about the 100 cells with the highest CellWhisperer scores (query from c). e, Expression levels of the LGR5 gene mentioned in the CellWhisperer response (in d), plotted for the cell cluster from c. f, Histogram of CellWhisperer scores (query from c) for cells derived from inflamed versus noninflamed tissue. g, Outline of a conventional bioinformatics analysis that produces results similar to those of the interactive CellWhisperer analysis (a–f). h, UMAPs before and after batch effect correction using scVI. i, Cell type annotation using CellTypist with cluster-level majority voting. j, Identification of a cell subset labeled ‘Stem cells’ using CellTypist without cluster-level majority voting, plotted on top of the UMAP from i. k, Differentially expressed genes between putative stem cells (from j) and all other cells, ranked by log2-transformed fold change and colored by statistical significance (two-sided Wilcoxon test threshold: 0.0001) with a log2-transformed fold change of at least 1 (gray line). ***Adjusted P = 1.4 × 10^−25. l, Differential expression of a generic stemness gene signature among the putative stem cells (from j) for cells from inflamed versus noninflamed tissue. Violin plots are shown, with inner box plots corresponding to the interquartile range and whiskers extending to the farthest data point within 1.5 times the interquartile range. **Adjusted P = 0.0024 (one-sided t-test).
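
The whisker convention described for the box plots in l can be illustrated with a minimal Python sketch. This is not the authors' plotting code: the function name `box_whiskers`, the parameter `k` and the choice of linear-interpolation quartiles (`method="inclusive"`) are assumptions made for illustration, and other quartile definitions give slightly different box edges.

```python
from statistics import quantiles


def box_whiskers(values, k=1.5):
    """Return (q1, q3, lower_whisker, upper_whisker) for a sample.

    Hypothetical helper: whiskers extend to the farthest data point
    lying within k times the interquartile range of the box edges.
    """
    data = sorted(values)
    # Quartiles by linear interpolation between order statistics
    # (an assumed convention; the caption does not specify one).
    q1, _median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - k * iqr
    high_fence = q3 + k * iqr
    # Whisker ends are actual observations, not the fences themselves.
    lower = min(v for v in data if v >= low_fence)
    upper = max(v for v in data if v <= high_fence)
    return q1, q3, lower, upper


# Example: the value 30 lies beyond the upper fence, so the upper
# whisker stops at 9 and 30 would be drawn as an outlier.
result = box_whiskers([1, 2, 3, 4, 5, 6, 7, 8, 9, 30])
```

For this toy sample the sketch returns `(3.25, 7.75, 1, 9)`: the box spans 3.25 to 7.75, and the whiskers end at the observations 1 and 9.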
