Fig. 1: Reconstruction and validation of EMT trajectories from transcriptomics data. | Nature Communications


From: Genomic and microenvironmental heterogeneity shaping epithelial-to-mesenchymal trajectories in cancer


a Workflow for reconstructing EMT trajectories from bulk and single-cell RNA-seq data. 1: Bulk and single-cell datasets are processed together to remove batch effects. 2: Dimensionality reduction is performed using PCA. 3: A k-nearest neighbours (kNN) algorithm is used to map new samples onto a reference EMT trajectory derived from scRNA-seq data. 4: Tumours are sorted by mesenchymal potential along an EMT pseudotime axis. T = EMT value at the specific time point, n = number of neighbours for sample i. b Distribution of EMT pseudotime values inferred using different single-cell RNA-seq templates. The consensus template combines all 10 datasets. c Application of the EMT trajectory reconstruction method to a time course experiment of A549 lung adenocarcinoma cells treated with TGF-β. The pseudotime estimate increases with time, as expected for gradually transforming cells. Replicates are depicted in different colours. d Scatter plot of EMT scores along the pseudotime axis across TCGA cancers. Each dot corresponds to a sample, coloured by its designated state. e Diagram of the transition probabilities for switching from one EMT state to another, as estimated by the HMM. f EMT scores differ significantly across biologically independent samples from TCGA in the epithelial (n = 3388), hEMT (n = 2764) and mesenchymal (n = 1028) categories, and the MET500 cohort (n = 496) (Kruskal–Wallis test p < 2.2e-16). The box centerlines depict the medians, and the edges depict the first/third quartiles. g EMT scores compared between cell lines from CCLE classified as non-metastatic (n = 116), weakly metastatic (n = 249) or metastatic (n = 111) according to MetMap500. The box centerlines depict the medians, and the edges depict the first/third quartiles. Two-sided Wilcoxon rank-sum test p values are displayed. h Association plot between the HMM-derived cell line states (rows) and their experimentally measured metastatic potential (columns) (conditional independence test p = 2.2e-16).
Source data are provided as a Source Data file.
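
Steps 3 and 4 of the workflow in panel a can be illustrated with a minimal Python sketch. It assumes, as one plausible reading of the legend's T and n, that a new sample's pseudotime is the mean of the pseudotime values T of its n nearest reference cells in the PCA-reduced space. The function name `knn_pseudotime`, the Euclidean distance, the toy coordinates and the sample names are all hypothetical and are not taken from the article; batch correction and PCA (steps 1 and 2) are assumed to have been done already.

```python
import math


def knn_pseudotime(sample, ref_coords, ref_pseudotime, n_neighbours=3):
    """Estimate pseudotime as the mean over the n nearest reference cells.

    Assumption: Euclidean distance in an already PCA-reduced space and an
    unweighted mean; the article may use a different distance or weighting.
    """
    if len(ref_coords) != len(ref_pseudotime):
        raise ValueError("each reference cell needs one pseudotime value")
    if not 1 <= n_neighbours <= len(ref_coords):
        raise ValueError("n_neighbours must be between 1 and the reference size")
    # Rank reference cells by distance to the new sample.
    order = sorted(range(len(ref_coords)),
                   key=lambda j: math.dist(sample, ref_coords[j]))
    nearest = order[:n_neighbours]
    # Average the pseudotime (T) of the n nearest neighbours.
    return sum(ref_pseudotime[j] for j in nearest) / n_neighbours


# Toy reference trajectory (hypothetical values): five single cells laid out
# along the first principal component, with increasing pseudotime.
ref_coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (4.0, 0.0)]
ref_pseudotime = [0.0, 10.0, 20.0, 30.0, 40.0]

# Hypothetical bulk samples projected into the same reduced space.
bulk_samples = {"tumour_A": (0.9, 0.1), "tumour_B": (3.2, -0.1)}

# Step 3: map each sample onto the reference trajectory.
estimates = {name: knn_pseudotime(xy, ref_coords, ref_pseudotime)
             for name, xy in bulk_samples.items()}

# Step 4: sort samples along the pseudotime axis.
ranked = sorted(estimates, key=estimates.get)
```

In this toy setting `ranked` lists the samples from the epithelial end of the axis towards the mesenchymal end. A real analysis would more likely rely on an established nearest-neighbour implementation and many more principal components.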
