Comprehensive analysis of single-cell and bulk RNA sequencing uncover tumor microenvironment diversity in invasive Retinoblastoma

Wan, Wei; Chen, Xiaoqian; Liu, Hongli; Yang, Ling; Huang, Pingping

doi:10.1038/s41598-025-23779-1

Article
Open access
Published: 14 November 2025

Scientific Reports volume 15, Article number: 39954 (2025)

Abstract

In this study, we analyzed publicly available scRNA-seq data from primary tumor tissues of 10 Retinoblastoma (RB) patients to explore tumor microenvironment (TME) heterogeneity. Our findings revealed distinct subpopulations of cone precursor (CP) cells, with higher proportions in invasive RB. Differential gene expression and pathway analysis highlighted functional diversity among CP subpopulations, with CP4 showing elevated TGF-β signaling in invasive RB. Cell–cell interaction analysis further identified rewiring of communication networks, with increased fibroblast–CP interactions in invasive tumors. Bulk RNA-seq identified two molecular subtypes, with subtype 1 showing an immunosuppressive TME. Finally, DOK7 was identified as a key gene associated with invasion, with functional assays confirming its role in promoting tumor progression. These results provide valuable insights into RB progression and potential therapeutic targets.

Methods

Sample selection

Tumor samples were obtained from retinoblastoma patients undergoing primary enucleation (PE). No patients received prior systemic or intraocular therapy. Enucleation status and clinical details are provided in Supplementary Table 1.

Data acquisition and preprocessing

The single-cell RNA sequencing (scRNA-seq) dataset of retinoblastoma, comprising 10 individual RB samples, was obtained from the Gene Expression Omnibus (GEO) under the accession numbers GSE249995 and PRJNA737188 (https://www.ncbi.nlm.nih.gov/geo/). Additionally, bulk RNA-seq data were retrieved from GEO under the accession number GSE57362.

Clustering, dimension reduction, and cell annotation of scRNA-seq

The single-cell RNA sequencing data were processed using the “Seurat” R package (version 4.2.0)¹. The dataset was normalized via the log-normalization method, and the top 2,000 highly variable genes were identified with the “FindVariableFeatures” function. Subsequently, the data were scaled using the “ScaleData” function, and to mitigate potential batch effects, we employed the “RunHarmony” function². After this, dimensionality reduction was performed through principal component analysis (PCA) with the “RunPCA” function, focusing on highly variable genes. The first 20 principal components (dims = 20) were selected for downstream clustering, which was achieved using the “FindNeighbors” and “FindClusters” functions at a resolution of 0.5. To identify marker genes across the 10 identified subgroups, the “FindAllMarkers” function was employed with thresholds of log2 fold change > 0.25 and a minimum percentage (min.pct) of 0.25. Marker genes were filtered using a corrected p-value threshold of < 0.05.
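
As a rough illustration of the log-normalization step, the sketch below applies the transformation log1p(count / total × scale factor) to a single cell. It is written in Python rather than R, the scale factor of 10,000 is an assumption (the value is not stated in the text), and the count vector is hypothetical.

```python
import math


def log_normalize(counts, scale_factor=10_000):
    """Log-normalize one cell: log1p(count / total_counts * scale_factor)."""
    total = sum(counts)
    if total == 0:
        # A cell with no counts carries no signal; return zeros.
        return [0.0 for _ in counts]
    return [math.log1p(c / total * scale_factor) for c in counts]


# Hypothetical raw counts for four genes in one cell (not study data).
cell = [0, 3, 1, 6]
normalized = log_normalize(cell)
```

Dividing by the per-cell total removes differences in sequencing depth between cells, and the log transform compresses the range of highly expressed genes before variable-feature selection and scaling.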

Sub-clustering of CP cell subpopulations

To further characterize the heterogeneity within the cone precursor (CP) cluster, we performed sub-clustering analysis using the Seurat R package. First, cells within the CP cluster were extracted and re-processed following the standard Seurat workflow, including “NormalizeData”, identification of highly variable features (“FindVariableFeatures” with the vst method, 2,000 features), and “ScaleData”. Principal component analysis (PCA) was then performed, and the top principal components were used for UMAP dimensionality reduction. Clustering was carried out using the “FindNeighbors” and “FindClusters” functions at a resolution of 0.4 to identify finer subpopulations.

Identification of the malignant cone precursor cells

The “InferCNV” package (version 1.6.0) was employed to infer copy number variants (CNVs) in CP (Cone Precursor) cells using immune cells as the reference group. The analysis was performed with default parameters. To estimate the CNV signal for individual CP cells, a 100-gene sliding window approach was applied. Prior to analysis, genes with a mean count of less than 0.1 across all cells were filtered out to reduce noise. The CNV signal was further denoised using a dynamic threshold derived from the mean signal intensity. Based on the CNV accumulation scores, CP cells were classified into distinct groups to identify potential malignant or abnormal cell populations.

We calculated a CNV score for each cell based on the “inferCNV” output matrix. Specifically, we first subtracted 1 from each CNV value to center the data, then calculated the average of the absolute values across all genes for each cell. This CNV score reflects the extent of deviation from diploid status, thus providing a quantitative measure of chromosomal abnormality.
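
The CNV score described above can be sketched as follows. This is a minimal Python illustration, not the authors' code: the function name `cnv_score`, the dictionary layout, and the example values are all hypothetical, and the input is assumed to be one row of relative CNV values per cell, centered at 1 for the diploid state.

```python
def cnv_score(cnv_values):
    """Mean absolute deviation from 1 across all genes for one cell."""
    deviations = [abs(value - 1.0) for value in cnv_values]
    return sum(deviations) / len(deviations)


# Hypothetical CNV values per cell (rows of an inferCNV-style matrix).
cnv_matrix = {
    "cell_a": [1.0, 1.0, 1.0, 1.0],   # no deviation from diploid
    "cell_b": [1.2, 0.8, 1.0, 1.1],   # gains and losses on some genes
}
scores = {cell: cnv_score(values) for cell, values in cnv_matrix.items()}
```

Taking the absolute value means gains and losses both raise the score instead of cancelling each other, so a higher score indicates a larger overall departure from the diploid baseline.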

Pseudo-time trajectory analysis

“CytoTRACE” is a computational tool that estimates the developmental potential of single cells by measuring transcriptional diversity. Cells with higher “CytoTRACE” scores are inferred to be less differentiated and more progenitor-like. This approach facilitates the reconstruction of cellular differentiation trajectories and is especially useful for identifying immature cell populations within complex tissues. Pseudo-time trajectory analysis was performed to investigate the developmental dynamics of cone precursor cells. The DEGs were first identified using the Seurat package. The R package Monocle (version 2.4) was subsequently used to construct the pseudo-time trajectory using default parameters³. Dimensionality reduction for pseudo-time ordering was performed using the “reduceDimension” function, with the maximum number of components set to 2 and the reduction method specified as “DDRTree”. To identify genes significantly associated with developmental progression, the “differentialGeneTest” function was employed.

Cell–cell communication analysis

Intercellular communication was investigated using “CellPhoneDB” (version 2.0.0), which computes the significance of cell–cell interactions by analyzing ligand–receptor pairs based on normalized expression matrices and permutation testing⁴. Ligand–receptor interactions with nominal p values < 0.05 were retained for further analysis. To explore interactions in greater depth, the “NicheNet” framework was applied. This approach links ligands expressed in one cell type to target genes expressed in another, enabling the identification of key signaling pathways influencing specific cellular behaviors.

Comparison of cell–cell interactions in invasive versus non-invasive RB

Differential ligand–receptor interactions between invasive and non-invasive RB samples were assessed using the Mann–Whitney U test. Interaction scores for each ligand–receptor pair across all cell types were compared between the two groups. To account for multiple hypothesis testing, adjusted p values were calculated using the false discovery rate (FDR) method proposed by Benjamini and Hochberg. Ligand–receptor pairs with adjusted p values less than 0.05 were considered statistically significant.
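
The Benjamini–Hochberg adjustment mentioned above can be illustrated with a short Python sketch. It assumes the raw p values (for example from per-pair Mann–Whitney U tests) are already available; the pair labels and p values below are hypothetical placeholders, not results from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the input order."""
    m = len(p_values)
    # Indices of the p values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p values for four ligand-receptor pairs.
pairs = ["pair_1", "pair_2", "pair_3", "pair_4"]
raw_p = [0.001, 0.02, 0.04, 0.30]
adj_p = benjamini_hochberg(raw_p)
significant = [name for name, p in zip(pairs, adj_p) if p < 0.05]
```

In this toy example the third pair has a nominal p value below 0.05 but an adjusted value above it, which is the kind of borderline call the correction is meant to filter out.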

Molecular gene classifications of the RB patients

The “ConsensusClusterPlus” R package (version 1.72.0) was applied to explore the molecular classification of bulk RNA-seq cohorts based on the expression pattern⁵. Unsupervised consensus clustering was based on the k-means algorithm, with 80% of the samples resampled in each of 1,000 iterations. The relative change in the area under the consensus cumulative distribution function (CDF) curves was used to determine the optimal number of clusters. The prognosis of patients in the two subgroups was then assessed using Kaplan–Meier survival analysis. To further investigate the relationships between the ligand–receptor pair genes and the clinical characteristics, we compared the histopathological features of the two subgroups.

Gene set enrichment analysis and functional annotation

Differentially expressed genes (DEGs) were analyzed for functional enrichment using the R package “clusterProfiler”. The Gene Ontology (GO) analysis covered three main domains: cellular components (CC), biological processes (BP), and molecular functions (MF). Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment was also conducted to identify key pathways associated with the DEGs. To explore pathways linked to different molecular subtypes, Gene Set Enrichment Analysis (GSEA) was performed. The analysis utilized comprehensive gene sets from the Molecular Signatures Database (MSigDB), enabling the identification of enriched pathways across the subtypes.

Cell-type and immune infiltration assessment between subgroups

To accurately estimate the proportions of various cell types within tissues, cell type deconvolution was performed using the “CIBERSORT” algorithm, which analyzes input gene expression matrices to infer immune cell compositions. Variations in immune cell proportions across different groups were compared to identify distinct immune profiles. To explore the association between risk scores and infiltrating immune cells, Spearman correlation analysis was conducted. The results of the correlation analysis, along with the differences in immune cell abundances, were visualized using the “ggplot2” package in R.

Weighted gene co-expression network analysis and module gene selection in invasive vs. non-invasive

WGCNA was used to build gene co-expression networks and identify modules distinguishing invasive and non-invasive RB⁶. Outlier samples were removed using “goodSamplesGenes” and a scale-free network was constructed from the top 10,000 highly expressed genes. The optimal soft threshold (β) was determined with “pickSoftThreshold”. Modules were identified via “blockwiseModules”, and sample clustering was visualized with “plotDendroAndColors”. Module correlations were shown with a labeled heatmap, and associations between modules were illustrated with plotEigengeneNetworks. Gene significance (GS) and module membership (MM) correlations were calculated, and module-specific genes were extracted for further analysis.
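
To make the role of the soft threshold concrete, the sketch below builds a small co-expression adjacency by raising the absolute Pearson correlation to the power β. This is an illustrative Python sketch rather than the WGCNA implementation: it assumes an unsigned network, uses β = 10 as reported in the Results, and the expression values and gene names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length expression vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def soft_threshold_adjacency(expr, beta):
    """Unsigned adjacency: a_ij = |cor(gene_i, gene_j)| ** beta."""
    genes = list(expr)
    return {
        (g1, g2): abs(pearson(expr[g1], expr[g2])) ** beta
        for g1 in genes
        for g2 in genes
    }


# Hypothetical expression of three genes across four samples.
expr = {
    "gene_a": [1.0, 2.0, 3.0, 4.0],
    "gene_b": [2.0, 4.0, 6.0, 8.1],
    "gene_c": [4.0, 1.0, 3.0, 2.0],
}
adjacency = soft_threshold_adjacency(expr, beta=10)
```

Raising correlations to a power keeps strong co-expression (gene_a and gene_b) close to 1 while pushing weak correlations (gene_a and gene_c, |r| = 0.4) toward 0, which is what lets the network approximate a scale-free topology.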

Statistical analysis

All statistical analyses were performed in R (version 4.2.0). Spearman’s correlation was used for the correlation analysis, and the Wilcoxon test was applied to examine differences between the two groups. Statistical significance was defined as p < 0.05.

Cell culture and cell transfection

The Y79 cell line (American Type Culture Collection, ATCC) was maintained in RPMI-1640 medium supplemented with 10% fetal bovine serum (FBS) and 1% penicillin/streptomycin (P/S) under standard culture conditions (37 °C, 5% CO₂). For transfection experiments, cells were seeded in 6-well plates at a density of 3 × 10⁵ cells/well and allowed to adhere for 24 h before performing transient transfection using Lipofectamine 2000 (Invitrogen, Carlsbad, CA, USA) according to the manufacturer’s protocol. Three distinct DOK7-targeting siRNA sequences were employed:

5′-GCAATGATGTCCTCGTCTTGG-3′ (siRNA#1),
5′-GCCAAGCGGATTCATCTTTGA-3′ (siRNA#2),
5′-GCTGATGCTGGTCTACAAGGA-3′ (siRNA#3).

Quantitative polymerase chain reaction

Total RNA was extracted from cells using TRIzol reagent and reverse-transcribed into cDNA. qPCR was performed using SYBR Green, with triplicate reactions under standard cycling conditions. Data were normalized to GAPDH using the ΔΔCt method.
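
The ΔΔCt normalization can be worked through with a small Python sketch. The formula 2^(−ΔΔCt) assumes roughly 100% amplification efficiency for both the target and the reference gene; the function name and the Ct values below are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene by the 2^(-ddCt) method."""
    # Delta Ct: target normalized to the reference gene (e.g. GAPDH).
    delta_ct_sample = ct_target_sample - ct_ref_sample
    delta_ct_control = ct_target_control - ct_ref_control
    # Delta-delta Ct: treated sample relative to the control sample.
    delta_delta_ct = delta_ct_sample - delta_ct_control
    return 2 ** (-delta_delta_ct)


# Hypothetical mean Ct values: knockdown sample vs. negative control.
fold_change = relative_expression(
    ct_target_sample=26.0, ct_ref_sample=18.0,
    ct_target_control=24.0, ct_ref_control=18.0,
)  # ddCt = (26 - 18) - (24 - 18) = 2, so fold change = 0.25
```

A fold change below 1 indicates lower target expression in the treated sample than in the control, which is the expected direction for an effective siRNA.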

CCK-8, colony formation, and Transwell assays

Cells were reseeded in 96-well plates (3 × 10³ cells/well) and cultured for 0, 24, 48, and 72 h. Cell Counting Kit-8 reagent (Dojindo Laboratories, Kumamoto, Japan) was added (10 µL/well) followed by 2-h incubation at 37 °C. Absorbance at 450 nm was measured using a microplate reader (BioTek Instruments, USA), with wells containing medium only serving as blanks.

Transfected cells were plated in 6-well plates (500 cells/well) and maintained in complete medium for 10–14 days with medium renewal every 3 days. Resultant colonies were fixed with 4% paraformaldehyde (15 min), stained with 0.1% crystal violet (30 min), and quantified using ImageJ software (NIH, USA).

For migration assessment, 1 × 10⁵ cells in serum-free medium were seeded into upper chambers (8-µm pores, Corning Costar, USA). For invasion evaluation, chambers were pre-coated with Matrigel (1:8 dilution, BD Biosciences). Complete medium containing 10% FBS served as chemoattractant in lower chambers. After 24-h incubation, migrated/invaded cells were fixed, stained, and counted from five random fields per membrane using phase-contrast microscopy (200 × magnification).

Introduction

Retinoblastoma (RB) is a rare but highly aggressive pediatric ocular malignancy, originating from the retinal precursor cells^7,8. It remains a leading cause of childhood blindness and death if left untreated. Retinoblastoma is classified into two main forms: invasive and non-invasive⁹. Invasive tumors are characterized by their ability to spread beyond the retina, while non-invasive tumors remain confined to the ocular region¹⁰. The disparity in clinical outcomes between these two forms is largely attributed to differences in tumor biology, with invasive retinoblastomas exhibiting a more aggressive phenotype, higher rates of metastasis, and poor prognosis. Despite significant advances in therapeutic approaches, the molecular and cellular mechanisms driving tumor invasiveness and metastasis remain poorly understood⁸.

RB is clinically categorized into non-invasive and invasive subgroups based on tumor behavior¹¹. Non-invasive cases are characterized by tumors confined to the retina and subretinal space, sometimes involving the vitreous, while invasive cases exhibit tumor infiltration into the choroid, optic nerve, sclera, or beyond, with a significant risk of metastasis. Over recent decades, various treatment strategies, including enucleation, cryotherapy, brachytherapy, and systemic, intraarterial, or intravitreal chemotherapy, have been developed to address RB^12,13. Patients with invasive RB are at heightened risk of tumor metastasis, which severely impacts survival outcomes, emphasizing the critical need for early diagnosis to enable timely intervention and optimize treatment strategies to preserve vision, the eye, or life itself. Therefore, identifying distinct characteristics of invasive RB compared to non-invasive cases using noninvasive screening methods is essential for developing early diagnostic and prognostic markers that can improve patient management and outcomes¹⁴.

In recent years, the tumor microenvironment (TME) has gained increasing attention as a critical factor in tumor progression^13,15,16,17. The TME is composed of a variety of cell types, including tumor cells, immune cells, fibroblasts, and endothelial cells, each contributing to the tumor’s growth, invasiveness, and therapeutic resistance. In retinoblastoma, however, the cellular and molecular landscape of the TME, particularly in relation to invasive and non-invasive disease, remains underexplored^18,19,20,21. The integration of single-cell RNA sequencing (scRNA-seq) and bulk RNA-seq presents a promising strategy to elucidate the complex cellular interactions within the TME and identify key drivers of retinoblastoma progression^22,23,24.

Bulk RNA-seq lacks the ability to resolve cellular-level heterogeneity. By integrating scRNA-seq and bulk RNA-seq, we can combine the high-resolution cellular insights of scRNA-seq with the comprehensive gene expression data of bulk RNA-seq^25,26. This approach allows us to build a more complete picture of the TME, shedding light on the molecular differences between invasive and non-invasive retinoblastoma and identifying novel targets for therapeutic intervention.

One of the unique aspects of retinoblastoma is its origin in retinal precursor cells, which can differentiate into various retinal cell types, including cone photoreceptor cells. Cone cells are primarily responsible for color vision and visual acuity, and they represent a crucial component of the retina’s architecture. Recent studies suggest that cone cells, particularly in the context of retinoblastoma, play a critical role in tumor progression^18,20. Invasive retinoblastomas often exhibit malignant transformation of cone precursor cells, leading to enhanced proliferation and metastatic potential. These cone-derived tumor cells may contribute to the aggressive phenotype of invasive tumors, altering the local microenvironment to promote invasion and resistance to treatment^16,27,28.

This study hypothesizes that cone cells in the retina play a pivotal role in the malignant progression and invasiveness of retinoblastoma. It aims to examine the specific transcriptional signatures of cone cells within the tumor microenvironment (TME), comparing their presence and activity in invasive versus non-invasive retinoblastoma samples. Understanding how cone cells influence tumor behavior may uncover novel mechanisms underlying tumor progression, including the modulation of immune cell infiltration, stromal remodeling, and angiogenesis. Furthermore, targeting cone cell-derived pathways may provide novel therapeutic strategies for inhibiting tumor progression and improving clinical outcomes for patients with invasive retinoblastoma.

In this study, we aim to integrate scRNA-seq and bulk RNA-seq data to comprehensively characterize the tumor microenvironment in invasive and non-invasive retinoblastoma samples. By focusing on cone cells and their malignant progression, we aim to elucidate the critical molecular pathways that govern retinoblastoma invasiveness. Specifically, we will examine the differential expression of genes associated with cone cell differentiation, immune cell interactions, and extracellular matrix remodeling within the tumor microenvironment (TME). This integrative approach is expected to not only enhance our understanding of retinoblastoma biology but also identify key biomarkers and potential therapeutic targets that may be leveraged to improve treatment strategies for this challenging disease.

Results

Diverse cell types in retinoblastoma (RB) were characterized using scRNA-seq analysis

We analyzed the scRNA-seq data from the GEO database (GEO accession number: GSE249995), which included primary tumor lesions and adjacent normal tissues from 10 RB patients, to explore the tumor microenvironment (TME) (Fig. 1A)^18,20. These 10 samples encompassed different RB subtypes, including 4 invasive and 6 non-invasive RB. After quality control and doublet removal, a total of 90,316 cells were retained for analysis^1,2,29. Specifically, we obtained 41,753 cells from invasive RB and 48,563 cells from non-invasive RB patients (Fig. 1B). The relative proportion and number of cells in each cluster are depicted in Fig. 1C. Our analysis revealed that nearly all cell populations were present in both non-invasive and invasive RB, though their distribution varied significantly across patients (Fig. 1C, D). To annotate the cell types, we used classic marker genes identified in previous studies. Epithelial cells expressed high levels of PSCA and SPINK1, while T cells were characterized by elevated levels of CD3E. KRT5+ epithelial cells displayed high expression of KRT5, endothelial cells expressed ACKR1, and fibroblasts expressed DCN. The proportions of cone precursor (CP) cells, microglial cells, and MKI67+ cone precursors were significantly higher in invasive than in non-invasive RB patients (Fig. 1E). These cell populations may play crucial roles in shaping the TME in invasive RB patients. These findings suggest that variations in the cellular composition of RB, particularly in CP cells, might correlate with mechanisms of tumor invasion and metastasis.

Aberrant gene expressions in cone precursor cells between invasive and non-invasive RB

Invasive RB is associated with a higher likelihood of metastasis to distant organs, which significantly impacts patient survival rates. Therefore, early diagnosis of invasive RB is crucial to enable timely intervention and the implementation of optimal treatment strategies, ultimately improving survival outcomes³⁰. Previous studies have suggested that RB originates from maturing cone precursor cells (CPs). It is important to note that invasive RB often evolves from non-invasive forms^7,10. However, the cell type composition of RB, especially in non-invasive lesions, remains poorly characterized. Recent research has indicated that CPs represent one of the most abundant cell types in RB. To explore the heterogeneity of epithelial cell types during retinoblastoma differentiation, we performed unsupervised clustering analysis to subclassify the precursor cells. We identified six distinct subtypes of CPs, each characterized by distinct marker genes (Fig. 2A). The proportions of the CP subpopulations differed between the invasive and non-invasive groups. The CP5 cluster was exclusive to the invasive group, while CP6 was unique to the non-invasive group, reflecting significant sample heterogeneity and suggesting distinct roles in the tumor (Fig. 2B). The “inferCNV” analysis further revealed significant differences in copy number variation between epithelial subtypes derived from malignant cells and normal tissue epithelial cells, suggesting functional heterogeneity of epithelial cells within the tumor (Fig. 2C). These findings emphasize the critical role of epithelial cells in RB. To further elucidate the functional characteristics of the CP subpopulations, Gene Set Variation Analysis (GSVA) was conducted.
The results revealed distinct pathway activities among the subpopulations: the CP4 subpopulation exhibited significantly elevated activity in the TGF-β signaling pathway, while the CP3 subpopulation showed increased activity in pathways associated with hypoxia and G2M checkpoint regulation. In contrast, the CP2 subpopulation exhibited reduced activity in these pathways but displayed pronounced activation of the IL6-JAK-STAT3 pathway, suggesting a unique immunoreactive profile. These findings highlight the functional heterogeneity within the CP subpopulations and their potential contributions to shaping the tumor microenvironment (Fig. 2D). To better understand the functional differences in cell types between non-invasive RB and invasive RB, we identified both upregulated and downregulated genes of CP cells between them. We observed that CP4 exhibited more cell type specific upregulated DEGs than downregulated DEGs (Fig. 2E). The upregulated DEGs were primarily enriched in pathways associated with RNA splicing and oxidative phosphorylation, whereas the downregulated DEGs were involved in regulating pathways such as ribosome biogenesis, RNA catabolic processes, and cytoplasmic translation (Fig. 2F, G). This contrasting pattern highlights distinct functional alterations in cellular processes between the compared conditions.

Heterogeneity of CP cells during the invasion and metastasis of RB patients

To explore the developmental trajectories of CP cell subpopulations, we organized them based on their differentiation states. Using “CytoTRACE”, we identified CP4 as the initiation state with higher differentiation potential, while CP2 and CP6 represented terminal states in the differentiation trajectory (Fig. 3A, B). Pseudo-time analysis revealed that CP1 acted as a transitional state between CP2 and CP6 clusters (Fig. 3B). Notably, CP cells from some patients clustered closely at the end of the branches, indicating a homogeneous terminal phenotype, whereas others exhibited diverse and heterogeneous profiles along the cancer developmental trajectories. Further analysis using Monocle2 elucidated the developmental trajectory of malignant CP cells, demonstrating that different clusters occupied distinct stages of differentiation (Fig. 3C, D). KEGG pathway analysis revealed distinct functional enrichments in the four clusters. Cluster 1 was enriched in pathways related to cytoplasmic translation, ribonucleoprotein complex biogenesis, ribosomal biogenesis, and protein-RNA complex assembly, indicating a strong association with protein synthesis and ribosome-related processes³¹. Cluster 2 was enriched in DNA repair, chromosome segregation, DNA replication, and regulation of chromosome organization, highlighting its role in genomic stability and cell division. Cluster 3 was associated with pathways including neuron projection development, cell growth, and regulation of cell projection organization, suggesting involvement in neuronal development and morphogenesis. Finally, Cluster 4 exhibited enrichment in cellular respiration, oxidative phosphorylation, and energy precursor generation, emphasizing its role in energy metabolism and ATP production. These findings underscore the functional diversity and specialization among the clusters, highlighting their distinct roles in cellular processes (Fig. 3E). 
In summary, this comprehensive analysis highlights the dynamic developmental trajectories and functional heterogeneity of CP cell subpopulations, providing valuable insights into their roles in tumor progression and potential therapeutic targets.

Comparison of cell–cell interactions in the TME between invasive and non-invasive RB patients

To further investigate potential cell–cell interactions within the TME, we employed the “CellChat” software^16,28. Distinct differences in cell–cell communication patterns were observed between invasive and non-invasive RB patients. For instance, in non-invasive RB patients, CP6 predominantly interacted with cone cells, whereas in invasive RB, cone cells showed stronger interactions with fibroblasts (Fig. 4A). To quantify these differences, we calculated the ratio of total interaction frequencies for each cell pair between the two groups (Fig. 4B). Notably, the interaction ratios between invasive and non-invasive RB revealed that Rod/CP5 and retinoma-like cell/CP5 interactions were the most upregulated, while fibroblast/CP6 interactions were the most downregulated. These findings highlight the dynamic rewiring of cell–cell communication networks during RB progression and suggest potential mechanisms underlying tumor invasiveness (Fig. 4B). To delve deeper into the molecular mechanisms driving changes in cellular interactions within the TME of invasive and non-invasive RB, we utilized the “NicheNet” R package. This tool enabled us to explore ligand–receptor crosstalk by analyzing expression profiles and downstream targets of ligand–receptor pairs. Specifically, we employed the “nichenet_seuratobj_aggregate” function to conduct differential cell communication analysis and reconstruct receptor–ligand regulatory networks²⁸. In our analysis, we designated immune cells and stromal components (such as fibroblasts, retinoma-like cells, rod cells, neural cells, and cone cells) as the sender population to map out cell–cell regulatory networks. Our results revealed a multitude of significantly differentially expressed ligand–receptor pairs between invasive and non-invasive tumors, including BMP7–BMPR2, EFNA5–EPHA8, and LAMB1–CADM3 (Figs. 4C, S1).
This comprehensive characterization of ligand–receptor interactions (LRIs) in invasive and non-invasive RB underscores the pivotal role of gene expression dynamics among diverse cell types within the TME in the initiation and progression of RB. The KEGG pathway enrichment analysis revealed significant pathways associated with ferroptosis, including the p53 signaling pathway, cellular senescence, apoptosis, and cell cycle. Additionally, key signaling pathways such as JAK-STAT, HIF-1, and PI3K-AKT were identified (Fig. 4D). We examined the expression of receptor–ligand pair genes across different cell types. For example, BMP7 was predominantly expressed in CP cells and neural cells, whereas CD99 was mainly expressed in a subset of CP cells (Fig. 4E). These findings collectively underscore the critical role of dynamic ligand–receptor interactions and signaling pathway modulation in shaping the tumor microenvironment and driving retinoblastoma progression.

Molecular characteristics of different molecular subtypes of RB based on gene expression

Bulk RNA-seq captures the average gene expression across all cells, while scRNA-seq enables precise cell-type-specific transcriptome profiling and the identification of aberrant gene expression at the resolution of individual cells^1,16,28. From the DEGs, we pinpointed genes with statistically significant associations for survival analysis, encompassing tissue-specific genes, cell-type-specific aberrant genes, and pseudo-temporal differentiation genes tracing the transition from normal to tumor cells (Fig. 5A). By evaluating the cumulative distribution function (CDF) and delta area, we identified k = 2 as the optimal clustering parameter, yielding two robust molecular subtypes with distinct transcriptional programs (Fig. 5B–D). Differential gene expression analysis between the two clusters identified a large set of DEGs, suggesting marked transcriptional differences between the subtypes (Fig. 5E, F). Enrichment analysis revealed that subtype 1 was predominantly enriched in proliferative and biosynthetic pathways, including cell cycle, DNA replication, ribosome, spliceosome, ferroptosis, and PI3K–AKT signaling, whereas subtype 2 was associated with immune and microenvironmental signaling, showing enrichment in focal adhesion, ECM–receptor interaction, cytokine–cytokine receptor interaction, Toll-like receptor, NF-κB, TNF, HIF-1, and T-helper cell differentiation pathways (Fig. 5G, H). Together, these findings suggest that the common DEGs identified at both bulk and single-cell resolutions not only serve as potential prognostic biomarkers for RB patients but also reflect an immunosuppressive TME, offering insights into the molecular mechanisms driving tumor heterogeneity and progression.

The immune landscape of RB patients in different subtypes

We investigated the relationships between patient subgroups and immune infiltration in RB patients. To do this, we used the “CIBERSORT” method for cell-type deconvolution and applied the Wilcoxon test to compare the distribution of 22 infiltrating immune cell types across different RB molecular subgroups (Fig. 6A, B). Notably, we found that, except for central memory CD8+ T cells and Treg cells, the distribution of most immune cell types significantly differed between subgroups. Subtype 1 was enriched in regulatory T cells and M2 macrophages, consistent with an immunosuppressive tumor microenvironment (TME) (Fig. 6C, D). In addition, several ICD-related genes (e.g., CXCL10, FPR1, HGF, IFNAR1, P2RX7, TLR4, ANXA1, TLR3) were significantly upregulated in subtype 1 (Fig. 6E). However, this upregulation did not correspond to enhanced immune activation, suggesting that ICD-related signaling may be insufficient to overcome the suppressive milieu of subtype 1. By contrast, subtype 2 exhibited higher expression of immune checkpoint molecules, including CD200, CD70, IDO1, LAG3, NRP1, PDCD1, PDCD1LG2, TIGIT, TNFRSF25, TNFRSF4, TNFRSF8, TNFRSF9, TNFSF4, and VSIR (Fig. 6F), indicative of checkpoint-mediated immunoregulation. Taken together, these findings demonstrate that subtype 1 is characterized by an immunosuppressive TME, whereas subtype 2 may be shaped by checkpoint-dependent immune regulation.

Weighted gene co-expression network construction between invasive and non-invasive

To identify regulatory genes associated with the onset of RB tumor invasion, we analyzed data from invasive and non-invasive patient groups. Using Weighted Gene Co-expression Network Analysis (WGCNA), we constructed correlation networks to identify clusters of highly correlated genes in microarray samples. After clustering the samples, a cut height (cutHeight = 120) was set to remove obvious outliers (Fig. 7A). A soft-thresholding power (β = 10) was applied to construct the adjacency matrix, so that the network approximated a scale-free topology while retaining connectivity information. With minModuleSize = 30 and mergeCutHeight = 0.25, a total of 10 modules were identified (Fig. 7B). To explore the relationship between modules and phenotypes, we calculated the correlation coefficient of each module with the invasive trait. One module (MEgreen) showed a statistically significant correlation with the invasive phenotype (Fig. 7C). We further screened the top 10 most connected genes in MEgreen and found that DOK7 was expressed at significantly higher levels in the invasive group than in the non-invasive group (Fig. 7D). qPCR analysis revealed that DOK7 expression was significantly downregulated in the siRNA#1 group compared to the NC group, whereas no notable reduction was observed in the siRNA#2 or siRNA#3 groups (Fig. 7E). Consequently, siRNA#1 was selected for DOK7 silencing. Functional assays showed that DOK7 knockdown reduced Y79 cell proliferation, as confirmed by CCK-8 and colony formation assays (Fig. 7F). It also inhibited migration in the scratch assay (Fig. 7G) and reduced both migration and invasion in the Transwell assay (Fig. 7H, I).
Taken together, these results indicate that DOK7 plays a critical role in promoting proliferation, migration, and invasion of Y79 cells. Targeting DOK7 may therefore represent a potential therapeutic strategy for inhibiting the invasive behavior of these cells.
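The soft-thresholding step of the WGCNA analysis described above can be sketched in a few lines of Python. This is an illustration under stated assumptions and not the WGCNA R implementation: it assumes an unsigned network, where the adjacency is the absolute Pearson correlation raised to the power β, and the gene names and expression values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def soft_threshold_adjacency(expr, beta=10):
    """Unsigned adjacency a_ij = |cor(gene_i, gene_j)| ** beta."""
    genes = list(expr)
    adj = {g: {} for g in genes}
    for g in genes:
        for h in genes:
            adj[g][h] = 1.0 if g == h else abs(pearson(expr[g], expr[h])) ** beta
    return adj


def connectivity(adj):
    """Connectivity k_i: sum of adjacencies to all other genes."""
    return {g: sum(a for h, a in row.items() if h != g) for g, row in adj.items()}


# Hypothetical expression of three genes across five samples.
expr = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0, 5.0],
    "GENE_B": [2.0, 4.0, 6.0, 8.0, 10.5],  # strongly co-expressed with GENE_A
    "GENE_C": [5.0, 1.0, 4.0, 2.0, 3.0],   # weakly correlated with the others
}

adj = soft_threshold_adjacency(expr, beta=10)
k = connectivity(adj)
print("Adjacency GENE_A-GENE_B:", adj["GENE_A"]["GENE_B"])
print("Adjacency GENE_A-GENE_C:", adj["GENE_A"]["GENE_C"])
print("Connectivity:", k)
```

Raising correlations to a high power keeps strong co-expression links close to 1 and pushes weak correlations toward 0, which is why hub genes in a module stand out by connectivity. In practice, β is chosen by checking how well the resulting connectivity distribution fits a scale-free topology, which this toy example does not attempt.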

Discussion

This study provides a detailed characterization of the tumor microenvironment (TME) in RB using single-cell RNA sequencing (scRNA-seq) to analyze the differences between invasive and non-invasive RB^32,33,34. Our findings highlight the crucial role of the TME in determining the aggressiveness of invasive RB and suggest potential avenues for therapeutic targeting.

We identified significant differences in cellular composition between invasive and non-invasive RB. Invasive tumors exhibited a higher proportion of cone precursor cells, microglial cells, and MKI67+ cone precursors, which may contribute to tumor progression and immune modulation. These cell populations likely play key roles in shaping the TME, promoting an immunosuppressive environment that facilitates tumor invasion and metastasis. Understanding these cellular dynamics is critical for advancing our knowledge of RB pathogenesis and identifying novel therapeutic strategies.

Further analysis revealed that invasive RB is associated with a more immunosuppressive TME, marked by elevated levels of regulatory T cells (Tregs) and myeloid-derived suppressor cells (MDSCs). These immune cells contribute to immune evasion and tumor progression, which is consistent with the poor prognosis observed in invasive RB patients. Notably, macrophage subpopulations, such as macrophage-0, macrophage-2, and macrophage-3, were found to correlate with survival outcomes, further emphasizing the importance of immune cell composition in RB prognosis.

In addition, we explored cell–cell interactions within the TME using “CellPhoneDB” and “NicheNet” analyses. Interaction patterns between cell types differed distinctly between invasive and non-invasive RB. For example, epithelial cells in invasive RB interacted more closely with myo-CAFs, a hallmark of aggressive tumors. These interactions are crucial for the progression of RB and underscore the importance of stromal–tumor crosstalk in invasive disease.

A potential limitation of our study is the possible confounding effect of tumor stage and burden. Invasive retinoblastomas are frequently associated with higher ICRB stage, optic nerve or choroidal invasion, and vitreous or subretinal seeding, all of which reflect greater tumor burden. Therefore, some of the observed TME differences—such as the enrichment of immunosuppressive cell populations and enhanced fibroblast–tumor interactions—may partly reflect disease stage rather than invasiveness itself. To address this concern, we performed subgroup analyses stratified by ICRB stage and seeding status, which showed that invasive tumors retained distinct TME features even after adjusting for stage.

Our study also identified two molecular subtypes of RB from gene expression profiles. Subtype 1, characterized by active cell cycle and DNA replication pathways, correlates with increased tumor aggressiveness. In contrast, subtype 2 showed upregulation of immune checkpoint genes and correlates with an immunosuppressive TME, potentially explaining its poorer response to conventional therapies.

In conclusion, our comprehensive analysis highlights the pivotal role of the TME in RB invasiveness. The identification of key cellular and immune populations, as well as specific cell–cell interactions, provides crucial insights into the mechanisms underlying RB progression. These findings may guide the development of targeted therapies focused on modulating the TME to enhance patient outcomes, particularly for those with invasive RB.

Data availability

The single-cell RNA sequencing (scRNA-seq) dataset of retinoblastoma, comprising 10 individual RB samples, was obtained from the Gene Expression Omnibus (GEO) under the accession numbers GSE249995 and PRJNA737188 (https://www.ncbi.nlm.nih.gov/geo/). Additionally, bulk RNA-seq data were retrieved from GEO under the accession number GSE57362.

References

Hao, Y. et al. Integrated analysis of multimodal single-cell data. Cell 184, 3573–3587.e29 (2021).
Korsunsky, I. et al. Fast, sensitive and accurate integration of single-cell data with Harmony. Nat. Methods 16, 1289–1296 (2019).
Qiu, X. et al. Reversed graph embedding resolves complex single-cell trajectories. Nat. Methods 14, 979–982 (2017).
Troulé, K. et al. CellPhoneDB v5: inferring cell–cell communication from single-cell multiomics data. Nat. Protoc. 19, 4635–4667 (2024).
Wilkerson, M. D. & Hayes, D. N. ConsensusClusterPlus: a class discovery tool with confidence assessments and item tracking. Bioinformatics 26, 1572–1573 (2010).
Zhang, B. & Horvath, S. A general framework for weighted gene co-expression network analysis. Stat. Appl. Genet. Mol. Biol. 4, Article 17 (2005).
Luo, C. & Deng, Y. P. Retinoblastoma: concerning its initiation and treatment. Int. J. Ophthalmol. 6, 397–401 (2013).
The global challenge of cancer. Nat. Cancer 1, 1–2 (2020).
Retinoblastoma. Nat. Rev. Dis. Primers 1, 15034 (2015).
Kamihara, J. et al. Retinoblastoma and neuroblastoma predisposition and surveillance. Clin. Cancer Res. 23, e98–e106 (2017).
Singh, L. & Kashyap, S. Update on pathology of retinoblastoma. Int. J. Ophthalmol. 11, 2011–2016 (2018).
Chévez-Barrios, P. et al. Metastatic and nonmetastatic models of retinoblastoma. Am. J. Pathol. 157, 1405–1412 (2000).
Ancona-Lezama, D., Dalvin, L. A. & Shields, C. L. Modern treatment of retinoblastoma: a 2020 review. Indian J. Ophthalmol. 68, 2356–2365 (2020).
Stenfelt, S., Blixt, M. K. E., All-Ericsson, C., Hallböök, F. & Boije, H. Heterogeneity in retinoblastoma: a tale of molecules and models. Clin. Transl. Med. 6, 42 (2017).
Wu, T. & Dai, Y. Tumor microenvironment and therapeutic response. Cancer Lett. 387, 61–68 (2017).
Patel, A. P. et al. Single-cell RNA-seq highlights intratumoral heterogeneity in primary glioblastoma. Science 344, 1396–1401 (2014).
Runa, F. et al. Tumor microenvironment heterogeneity: challenges and opportunities. Curr. Mol. Biol. Rep. 3, 218–229 (2017).
Wu, C. et al. Single-cell characterization of malignant phenotypes and microenvironment alteration in retinoblastoma. Cell Death Dis. 13, 438 (2022).
Xu, X. L. et al. Tumor-associated retinal astrocytes promote retinoblastoma cell proliferation through production of IGFBP-5. Am. J. Pathol. 177, 424–435 (2010).
Yang, J. et al. Single-cell transcriptome profiling reveals intratumoural heterogeneity and malignant progression in retinoblastoma. Cell Death Dis. 12, 1100 (2021).
Liu, Y. et al. Single-cell transcriptomics enable the characterization of local extension in retinoblastoma. Commun. Biol. 7, 11 (2024).
Tirosh, I. et al. Single-cell RNA-seq supports a developmental hierarchy in human oligodendroglioma. Nature 539, 309–313 (2016).
Zhang, L. et al. Single-cell analyses inform mechanisms of myeloid-targeted therapies in colon cancer. Cell 181, 442–459.e29 (2020).
He, D. et al. Single-cell RNA sequencing reveals heterogeneous tumor and immune cell populations in early-stage lung adenocarcinomas harboring EGFR mutations. Oncogene 40, 355–368 (2021).
Liu, J. et al. A high-risk retinoblastoma subtype with stemness features, dedifferentiated cone states and neuronal/ganglion cell gene expression. Nat. Commun. 12, 5578 (2021).
Newman, A. M. et al. Robust enumeration of cell subsets from tissue expression profiles. Nat. Methods 12, 453–457 (2015).
Tirosh, I. et al. Dissecting the multicellular ecosystem of metastatic melanoma by single-cell RNA-seq. Science 352, 189–196 (2016).
Jin, M. Z. & Jin, W. L. The updated landscape of tumor microenvironment and drug repurposing. Signal Transduct. Target. Ther. 5, 166 (2020).
Dobin, A. et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21 (2013).
Gulati, K., Poluri, K. M. & Kaliki, S. Assessing the metabolic variations of invasive and noninvasive unilateral retinoblastoma patients. ACS Omega 9, 40082–40094 (2024).
Kanehisa, M., Furumichi, M., Sato, Y., Matsuura, Y. & Ishiguro-Watanabe, M. KEGG: biological systems database as a model of the real world. Nucleic Acids Res. 53, D672–D677 (2025).
Sun, D. et al. Identifying phenotype-associated subpopulations by integrating bulk and single-cell sequencing data. Nat. Biotechnol. 40, 527–538 (2022).
Metzker, M. L. Sequencing technologies—the next generation. Nat. Rev. Genet. 11, 31–46 (2010).
Liao, J. et al. De novo analysis of bulk RNA-seq data at spatially resolved single-cell resolution. Nat. Commun. 13, 6498 (2022).

Funding

This research was supported by the Hubei Provincial Key Laboratory Open Project in 2022 (2022KFH012, 2022KFH019).

Author information

Wei Wan and Xiaoqian Chen contributed equally.

Authors and Affiliations

Eye Center, Renmin Hospital of Wuhan University, 238 Jiefang Road, Wuhan, 430060, Hubei Province, People’s Republic of China
Wei Wan, Xiaoqian Chen, Hongli Liu, Ling Yang & Pingping Huang

Contributions

L.Y. and P.H. conceived the project and designed the experiments. P.H. and S.Y. were involved in the analysis of the experiments and wrote the manuscript. P.H. and S.Y. performed the bioinformatic analysis. W.W. and C.X. contributed to the experiments and analyzed the data. All authors discussed the results and reviewed the manuscript.

Corresponding authors

Correspondence to Ling Yang or Pingping Huang.

Ethics declarations

Competing interests

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Ethics declarations

Since all the data used in this study are from public datasets, no ethical approval is required.

Consent for publication

Not applicable.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary Material 1 (DOCX)

Supplementary Material 2 (DOCX)

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Wan, W., Chen, X., Liu, H. et al. Comprehensive analysis of single-cell and bulk RNA sequencing uncover tumor microenvironment diversity in invasive Retinoblastoma. Sci Rep 15, 39954 (2025). https://doi.org/10.1038/s41598-025-23779-1

Received: 25 February 2025
Accepted: 08 October 2025
Published: 14 November 2025
Version of record: 14 November 2025
DOI: https://doi.org/10.1038/s41598-025-23779-1
