Extended Data Fig. 5: Development and Validation of the Prediction Model for Determining LOYSCR Status in a Pan-Cancer scRNA-seq Dataset. | Nature

From: Concurrent loss of the Y chromosome in cancer and T cells impacts outcome

a. Schematic of the development of the random forest model used to predict LOYSCR status in individual cells.

b. Expression levels of the 9 YchrS signature genes in male and female samples. Dot size indicates the proportion of expressing cells and color indicates the mean expression level.

c. Proportion of cells predicted to be LOYSCR in 6 major cell types in normal female and male samples from the scRNA-seq datasets. Error bars represent the 95% confidence interval (CI) of the mean. In female samples, the number of analyzed cells per cell type was as follows: B/Plasma cells (n = 23,797), Endothelial cells (n = 24,755), Epithelial cells (n = 174,909), Fibroblasts (n = 69,688), Myeloid cells (n = 69,306), and T/NK cells (n = 171,458). In male samples, the corresponding numbers were B/Plasma cells (n = 25,999), Endothelial cells (n = 22,908), Epithelial cells (n = 157,029), Fibroblasts (n = 31,122), Myeloid cells (n = 74,698), and T/NK cells (n = 185,299).

d. Expression levels of the 9 YchrS signature genes in LOYSCR and WTYSCR cells from male tumor samples. Dot size indicates the proportion of expressing cells and color indicates the mean expression level.

e. Total read counts per cell, with cells grouped by cell type and by LOYSCR or WTYSCR status. Violin plots show the full distribution of total read counts for each cell type. The box plots overlaid within each violin denote the median (center line) and the first and third quartiles (lower and upper edges of the box). Whiskers extend to the most extreme values within 1.5 × the interquartile range (IQR) of the box edges. Any data points beyond the whiskers (if shown) are considered outliers. The number of cells analyzed per category was as follows: among LOYSCR cells, B/Plasma cells (n = 5,935), Endothelial cells (n = 7,975), Epithelial cells (n = 73,576), Fibroblasts (n = 9,950), Myeloid cells (n = 23,292), and T/NK cells (n = 40,675); among WTYSCR cells, B/Plasma cells (n = 20,064), Endothelial cells (n = 14,933), Epithelial cells (n = 83,453), Fibroblasts (n = 21,172), Myeloid cells (n = 51,406), and T/NK cells (n = 144,624).

f. Scores for chromosome-specific signatures (each signature comprising all genes from the corresponding chromosome) in male cells, with cells grouped by LOYSCR or WTYSCR status. Data are presented as mean values ± 95% CI. The number of cells analyzed per category was as follows: among LOYSCR cells, B/Plasma cells (n = 5,935), Endothelial cells (n = 7,975), Epithelial cells (n = 73,576), Fibroblasts (n = 9,950), Myeloid cells (n = 23,292), and T/NK cells (n = 40,675); among WTYSCR cells, B/Plasma cells (n = 20,064), Endothelial cells (n = 14,933), Epithelial cells (n = 83,453), Fibroblasts (n = 21,172), Myeloid cells (n = 51,406), and T/NK cells (n = 144,624).

g. Correlation between YchrS from bulk RNA-seq data and the average Y chromosome (Ychr) copy number alteration (CNA) from whole-exome sequencing (WES) data (left), between YchrS and the proportion of LOYSCR cells identified by single-cell RNA sequencing (middle), and between the proportion of LOYSCR cells and the average Ychr CNA (right). Male and female samples are shown as blue and red dots, respectively, with each dot representing one sample. Lines show the linear regression fits for male and female samples, and shaded regions show the 95% CI. R, Pearson correlation coefficient; P, P value from the Pearson correlation test.
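The box-plot convention described for panel e (median, first and third quartiles, whiskers limited to 1.5 × IQR, outliers beyond) can be illustrated with a minimal Python sketch. The function name `box_plot_summary`, the example read counts, and the "inclusive" quartile method are assumptions made for illustration; the legend does not state which quartile convention or plotting software the authors used.

```python
import statistics


def box_plot_summary(values):
    """Tukey-style box-plot summary (hypothetical helper, not the authors' code)."""
    data = sorted(values)
    # Quartile convention is an assumption; plotting libraries differ slightly.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Whiskers end at the most extreme observations that lie inside the fences.
    inside = [v for v in data if lower_fence <= v <= upper_fence]
    return {
        "median": median,
        "q1": q1,
        "q3": q3,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        # Points beyond the fences are the ones drawn as outliers.
        "outliers": [v for v in data if v < lower_fence or v > upper_fence],
    }


# Made-up total read counts for a handful of cells (illustrative only).
total_counts = [1200, 1500, 1800, 2100, 2400, 2700, 3000, 3300, 12000]
summary = box_plot_summary(total_counts)
print(summary)
```

In this made-up example the cell with 12,000 reads falls above the upper fence, so it would be drawn as an outlier rather than stretching the whisker.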
