Introduction

Cancer is a leading cause of death in humans and a significant impediment to increasing life expectancy. The global burden of cancer incidence and mortality is swiftly rising1. Statistical data indicate that lung cancer was the most frequently diagnosed cancer in 2022, with nearly 2.5 million new cases, followed by female breast cancer (11.6%), colorectal cancer (9.6%), prostate cancer (7.3%), and gastric cancer (4.9%)2. Among women, the malignant tumors with uniquely high incidence rates are breast cancer (BC), cervical cancer (CC), endometrial carcinoma (EC), and ovarian cancer (OC). The predominant challenge in treating these malignant tumors is treatment resistance3,4,5. Hence, a more profound understanding of the mechanisms behind the onset and progression of these malignant tumors, as well as the pivotal role of the tumor microenvironment (TME) in cancer development, is likely to help researchers identify more effective therapeutic targets, thereby offering valuable insights for the treatment of these conditions.

The onset, progression, and metastasis of cancer induce extensive dynamic changes within the host tissue, culminating in the formation of a complex tumor stroma, also known as the TME6,7,8. The development of this tumor stroma is typically accompanied by significant proliferative responses in connective tissue, which leads to the production of abundant fibrous and connective tissue. The tumor stroma is generally composed of extracellular matrix (ECM) components and a diverse array of cell types, including immune cells, fibroblasts, and vascular endothelial cells9. Within this context, specific subpopulations of fibroblasts play a crucial role in immune responses, inflammation, angiogenesis, metabolism, hypoxia, and the remodeling of the ECM. As such, a substantial body of research indicates that targeting fibroblasts could represent a promising therapeutic approach7,8,10.

Fibroblast populations found in both primary and metastatic cancers are collectively known as cancer-associated fibroblasts (CAFs), which are linked to the initiation, progression, and metastasis of tumors11,12,13. CAFs identified in primary and metastatic tumors are highly adaptable, plastic, and resilient cells that play an active role in cancer progression through intricate interactions with various cell types within the TME14. Besides generating ECM components that contribute to the structure and function of the tumor stroma, CAFs undergo epigenetic alterations that affect the secretion of factors, exosomes, and metabolites, which in turn influence tumor angiogenesis, immune responses, and metabolism15,16. Nevertheless, the exact role of CAFs in cancer progression is not yet fully understood. Evidence from several clinical studies suggests that the initial activation of fibroblasts is part of the host’s response and defense mechanism against the tumor17,18, and this response may involve the creation of physical barriers associated with the ECM and fibroplastic reactions19. Additional research indicates that following tumor development, cancer cells co-opt the activation of fibroblasts and wound healing mechanisms to facilitate tumor growth20. Consequently, fibroblasts may exhibit both tumor-supportive and tumor-suppressive capacities, which are largely influenced by the TME and underscore the functional heterogeneity of CAFs.

The TME provides a crucial niche for the onset and progression of cancer21,22. Inflammatory cells and mediators are integral components of the TME, with tumor-associated macrophages (TAMs) emerging as a paradigm of the connection between inflammation and cancer23. In cancers derived from various tissues, the inflammatory constituents of the TME display significant diversity. Nonetheless, the infiltration of bone marrow-derived myeloid cells, including monocytes, macrophages, and dendritic cells, is a universal characteristic of cancer24. Among these myeloid cells, macrophages can eliminate tumor cells, mediate antibody-dependent cellular cytotoxicity and phagocytosis, induce vascular damage and tumor necrosis, and activate mechanisms of resistance to tumors mediated by innate or adaptive lymphoid cells21,24. Yet, in most established tumors, macrophages can facilitate cancer progression and metastasis through diverse mechanisms, including supporting cancer cell viability and proliferation, promoting angiogenesis, and dampening innate and adaptive immune responses21,25.

Currently, the crucial role of CAFs is largely neglected by most therapeutic methods, including immunotherapy and chemotherapy26. Furthermore, our present comprehension of the interactions between CAFs and TAMs within the TME is inadequate to underpin the development of dependable therapeutic strategies. Consequently, additional research is essential to enhance our understanding of these interactions and to facilitate the development of effective therapeutic interventions.

Recently, the advent of single-cell sequencing technology has allowed for the quantification of molecular alterations at the single-cell level, offering insights into the cellular responses to tissue changes. This provides a valuable framework for investigating the pathophysiological processes underlying diseases27. As a pioneering technique, it has demonstrated its efficacy as a robust tool for dissecting intercellular interactions and has also introduced new avenues for the clinical management of diseases. For example, in oncological research, single-cell sequencing facilitates the identification of cellular interactions and, leveraging existing knowledge, can pinpoint which cell types have transitioned to a cancerous state. This enhances our ability to explore the physiological onset and pathogenic mechanisms of tumors28.

In this study, we analyzed single-cell sequencing data from BC, CC, EC, and OC. We investigated the common functional features of various subtypes of CAFs and TAMs within the TME. Our findings revealed that in these cancer types, CAFs and TAMs play a significant role in the COLLAGEN signaling pathway via the COL1A1-CD44 ligand-receptor interaction and are also engaged in tumor angiogenesis. These findings provide novel insights and directions for the development of therapeutic strategies targeting CAFs and TAMs in the TME.

Materials and methods

Single-cell RNA sequencing data processing

In this study, data for BC, CC, EC, and OC were retrieved from the GEO database: GSE248288, GSE197461, GSE208653, GSE225689, GSE251923, and GSE184880. Additionally, the GSE225689, GSE251923, GSE192898, GSE235329, GSE221371, GSE179705, and GSE186344 datasets were used as the validation set to verify the reliability of the results. Detailed information on the patients included in this study is provided in Supplementary Table S1. For these datasets, the “harmony” algorithm was initially applied within the “Seurat” workflow to address batch effects. Subsequently, to maintain the integrity of the analysis, quality control was applied to the datasets using the following criteria: to be deemed high quality, each cell was required to have a gene expression count of at least 100, a UMI count exceeding 500, a mitochondrial UMI ratio below 10%, and a red blood cell gene ratio below 1%.
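
The four quality-control thresholds above can be illustrated with a minimal Python sketch. It is not the Seurat code used in this study: the `CellQC` record and its field names are hypothetical, and “gene expression count” is assumed here to mean the number of genes detected in a cell.

```python
from dataclasses import dataclass


@dataclass
class CellQC:
    """Per-cell summary statistics (hypothetical field names)."""
    barcode: str
    n_genes: int    # number of genes detected (assumed meaning of "gene expression count")
    n_umis: int     # total UMI count
    mito_umis: int  # UMIs assigned to mitochondrial genes
    rbc_umis: int   # UMIs assigned to red blood cell genes


def passes_qc(cell: CellQC) -> bool:
    """Apply the four thresholds stated in the text."""
    if cell.n_umis == 0:
        return False
    mito_ratio = cell.mito_umis / cell.n_umis
    rbc_ratio = cell.rbc_umis / cell.n_umis
    return (
        cell.n_genes >= 100      # at least 100 genes
        and cell.n_umis > 500    # more than 500 UMIs
        and mito_ratio < 0.10    # mitochondrial UMI ratio below 10%
        and rbc_ratio < 0.01     # red blood cell gene ratio below 1%
    )


def filter_cells(cells):
    """Keep only the cells that satisfy every threshold."""
    return [c for c in cells if passes_qc(c)]
```

A cell with exactly 500 UMIs is rejected by this sketch, because the criterion is a UMI count exceeding 500.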

Gene set functional analysis

The gene sets MSigDB_Hallmark_2020 and WikiPathway_2021_Human were acquired via the “enrichR” package. Enrichment analysis of marker genes was conducted with the R packages “enrichR” and “clusterProfiler”. The “AddModuleScore”29 function in the “Seurat” package was used to score the gene sets of enriched pathways, yielding the activity level of each enriched pathway within the cell clusters. The scoring results were then visualized as a heatmap.
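
The idea behind gene set scoring can be sketched as follows. This is a deliberately simplified illustration in Python, not Seurat’s implementation: Seurat selects control genes from expression-matched bins, whereas here the control genes are simply supplied by the caller, and all function names are hypothetical.

```python
from statistics import mean


def module_score(expr, gene_set, control_genes):
    """Score one cell as mean(gene set) - mean(control genes).

    expr maps gene name -> normalized expression for a single cell.
    Genes that are absent from expr are ignored.
    """
    target = [expr[g] for g in gene_set if g in expr]
    control = [expr[g] for g in control_genes if g in expr]
    if not target or not control:
        raise ValueError("gene set and control set must overlap the data")
    return mean(target) - mean(control)


def cluster_scores(cells, clusters, gene_set, control_genes):
    """Average the per-cell scores within each cluster (one heatmap column)."""
    by_cluster = {}
    for expr, label in zip(cells, clusters):
        by_cluster.setdefault(label, []).append(
            module_score(expr, gene_set, control_genes)
        )
    return {label: mean(values) for label, values in by_cluster.items()}
```

A positive score means the gene set is expressed above the control background in that cluster, which is what the heatmap is intended to convey.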

InferCNV analysis

The “inferCNV” package in R was utilized to investigate copy number variations (CNVs) in epithelial cells. The analysis adhered to the package’s official guidelines to derive the raw count matrix, cell type annotation file, and gene/chromosome position annotation file. T cells were designated as the reference normal cells, and the denoising threshold was adjusted to 2.5, with all other settings maintained at their standard defaults.
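
The principle of expression-based CNV inference can be sketched in a few lines of Python. This is only an outline of the core idea (centering on reference cells, then smoothing along the genome), under the assumption that genes are already ordered by chromosomal position; the actual “inferCNV” package adds log transformation, a much wider smoothing window, and denoising, none of which are reproduced here.

```python
def reference_center(cell_expr, reference_cells):
    """Subtract the per-gene mean of the reference (normal) cells."""
    n_genes = len(cell_expr)
    ref_mean = [
        sum(ref[i] for ref in reference_cells) / len(reference_cells)
        for i in range(n_genes)
    ]
    return [x - m for x, m in zip(cell_expr, ref_mean)]


def moving_average(values, window=3):
    """Smooth values along the gene order with a centered window.

    The window is truncated at both ends of the gene list.
    """
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


def cnv_profile(cell_expr, reference_cells, window=3):
    """Reference-centered, smoothed expression as a rough CNV signal."""
    centered = reference_center(cell_expr, reference_cells)
    return moving_average(centered, window)
```

Sustained positive values across neighboring genes suggest a gain relative to the reference cells, and sustained negative values suggest a loss.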

Cell–cell communication analysis

The “CellChat” package in R was used to analyze the interaction patterns of a randomly chosen subset of 20,000 cells. Analysis of the Secreted Signaling category within CellChatDB identified overexpressed genes and ligand-receptor pairs indicative of potential intercellular communication networks. When inferring communication networks, cell groups containing fewer than 10 cells were omitted. Visual representations were created to illustrate the number and strength of interactions, as well as diagrams of individual signaling pathways. The “NMF” package was used to categorize the cell types by their incoming and outgoing signaling patterns. The ligand-receptor gene pairs most closely associated with each cell type were identified from the analysis of these signaling pathways.
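
As a rough illustration of ligand-receptor scoring, the Python sketch below multiplies the mean ligand expression in a sender group by the mean receptor expression in a receiver group, and skips groups with fewer than 10 cells as described above. This product score is a stand-in chosen for brevity; it is not the probabilistic model that CellChat itself uses, and the function names are hypothetical.

```python
from statistics import mean

MIN_CELLS = 10  # groups smaller than this are skipped, as in the text


def group_means(cells, labels, gene):
    """Mean expression of one gene per cell group, dropping small groups."""
    groups = {}
    for expr, label in zip(cells, labels):
        groups.setdefault(label, []).append(expr.get(gene, 0.0))
    return {g: mean(v) for g, v in groups.items() if len(v) >= MIN_CELLS}


def lr_scores(cells, labels, ligand, receptor):
    """Score each (sender, receiver) pair as
    mean ligand expression in sender * mean receptor expression in receiver."""
    lig = group_means(cells, labels, ligand)
    rec = group_means(cells, labels, receptor)
    return {(s, r): lig[s] * rec[r] for s in lig for r in rec}
```

With such a score, a pair is high only when the ligand is abundant in the sender and the receptor is abundant in the receiver, which is the intuition behind a ligand-receptor interaction such as COL1A1-CD44.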

Pseudotime trajectory analysis

Pseudotime trajectory inference was performed with Monocle (version 2.26.0) to deduce the differentiation path of cell development and the progression of cell subtypes. In the Monocle analysis, genes that exhibited significant variation in expression across cells were chosen for the study. Next, the DDRTree algorithm in the reduceDimension function was utilized to compute the minimum spanning tree, with the aim of obtaining the trajectory lines of cellular differentiation. This ensures that all cells are connected while the sum of the edge weights between cells is minimized. Finally, the obtained trajectories were visualized using the plot_cell_trajectory function.
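
The minimum spanning tree property mentioned above (all cells connected, total edge weight minimal) can be illustrated with Prim’s algorithm. The Python sketch below shows only this concept on Euclidean distances between points; it does not reproduce DDRTree, which also learns the low-dimensional embedding.

```python
import math


def minimum_spanning_tree(points):
    """Prim's algorithm on Euclidean distances between cells.

    points: list of coordinate tuples (cells in a reduced space).
    Returns edges (i, j, distance) that connect every cell with the
    smallest possible total distance.
    """
    n = len(points)
    if n == 0:
        return []
    # best[j] = (distance from cell j to the tree, nearest cell in the tree)
    best = {j: (math.dist(points[0], points[j]), 0) for j in range(1, n)}
    edges = []
    while best:
        j = min(best, key=lambda k: best[k][0])
        d, i = best.pop(j)
        edges.append((i, j, d))
        # Cell j has joined the tree; it may now be the closest tree
        # member for some of the remaining cells.
        for k in best:
            dk = math.dist(points[j], points[k])
            if dk < best[k][0]:
                best[k] = (dk, j)
    return edges
```

A tree over n cells always has n - 1 edges, so the trajectory contains no cycles.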

Results

Single-cell sequencing identifies cell types in BC, CC, EC, and OC

To gain insights into the gene expression patterns of gynecological oncology at a single-cell resolution, we retrieved the relevant datasets for BC (4 cases), CC (8 cases), EC (7 cases), and OC (3 cases) from the GEO database, specifically from the entries GSE248288, GSE197461, GSE208653, GSE225689, GSE251923, and GSE184880. Comprehensive details about the patients enrolled in this study are provided in Supplementary Table S1. Batch correction using the “harmony” algorithm showed that the data from different samples mixed well together. Additionally, no significant differences in cell distribution among the samples were found (Fig. 1A,B).

Fig. 1

scRNA-seq reveals diverse cell types and tumor heterogeneity in different types of diseases. (A) UMAP plot displaying the samples after batch correction. (B) UMAP plot illustrating the cell distribution across different samples. (C) UMAP plot showing the cells from the different samples grouped into eight main cell types: B cells, Endothelial cells, Epithelial cells, Fibroblast cells, Macrophages, Neutrophils, NK_cell, and T_cells. (D) Bubble plots depicting the expression levels of marker genes for each cell type.

Next, we used the “SingleR()” function to identify the types of these cells. Ultimately, eight cell types were obtained: B_cells (IGHG1, IGHG3), Endothelial_cells (ACKR1, AQP1), Epithelial_cells (WFDC2, SLPI), Fibroblasts (COL3A1, DCN), Macrophages (C1QB, C1QA), Neutrophils (FCGR3B, PTGS2), NK_cell (GNLY, KLRD1), and T_cells (IL7R, CD3G), each characterized by a unique set of marker genes (Fig. 1C,D).

To explore the composition of different cell types in various diseases, a histogram of cell composition was plotted (Fig. 2A,B). The graph illustrates that in BC samples, epithelial cells and fibroblasts are predominant among non-immune cells, with non-immune cells comprising a significant portion of the overall cell composition. In CC and OC samples, epithelial cells and fibroblasts are also found in high abundance among non-immune cells; however, in these two disease types, non-immune cells constitute a smaller proportion of the total cell content, with immune cells being the most prevalent. Of note, the cell content graph reveals a pronounced heterogeneity in the proportion of non-immune cells within the BC-2 and CC-5 samples when contrasted with the other samples. Consequently, a more in-depth analysis of these two samples is essential to corroborate the robustness of the analytical outcomes. Moreover, in the EC samples, a notable disparity in the composition of various cell subtypes was detected in relation to other diseases. For example, epithelial cells are the most abundant within the non-immune cell population, while T cells, which are primarily involved in cellular immune responses, are comparatively less frequent. This observation leads us to hypothesize that epithelial cells may have a pivotal role in the initiation and progression of EC disease.

Fig. 2

Cell type content maps for different diseases and interaction communication UMAP plots. (A,B) Cell content bar charts for different disease types and different samples. (C) The interaction communication intensity of major cell types in BC, CC, and OC.

Epithelial-mesenchymal transition (EMT) is the cellular process by which epithelial cells are transformed into individual mesenchymal cells with the capacity to migrate and invade adjacent tissues. During tumorigenesis and tumor progression, this EMT process is re-activated, ultimately aiding tumor cells in migrating through the basement membrane, invading neighboring tissues, and entering the circulation30. EMT is a multi-step process marked by the loss of cell-cell junctions and the rearrangement of the cytoskeletal network, resulting in the loss of epithelial polarity and the acquisition of a mesenchymal-like phenotype31. Indeed, numerous studies utilizing cell cultures and mouse models have shown that epithelial tumor cells can adopt a mesenchymal morphology associated with the expression of mesenchymal markers, such as vimentin, N-cadherin, and α-smooth muscle actin (α-SMA), together with loss of the epithelial marker E-cadherin32. CAFs are an integral component of the TME and play a pivotal role in cancer progression, performing functions that either support or suppress tumor growth33. Together, these observations underscore the notion that epithelial cells and fibroblasts are the primary cell types involved in the initiation and progression of tumors.

Exploring the TME of different disease types using CellChat

The TME plays a crucial role at various stages of tumorigenesis, development, and metastasis. To understand the roles of cell types within the TME, CellChat was utilized to analyze the interactions of cells in different diseases within the TME (Fig. 2C). It was found that fibroblasts, endothelial cells, T cells, and macrophages were among the cell types with the strongest communication and interaction in the BC, CC, and OC cell role maps. Fibroblasts were determined to have the highest score for outgoing interaction strength, while macrophages were found to have the highest score for incoming interaction strength.
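
Outgoing and incoming interaction strength can be read as the weighted out-degree and in-degree of each cell type in the communication network. The Python sketch below illustrates that reading under this assumption; the function name and the input layout are hypothetical and do not correspond to the CellChat API.

```python
def signaling_roles(weights):
    """Summarize a communication network per cell type.

    weights[sender][receiver] = communication strength.
    Returns (outgoing, incoming): the summed strength each cell type
    sends and receives.
    """
    outgoing = {s: sum(targets.values()) for s, targets in weights.items()}
    incoming = {}
    for targets in weights.values():
        for receiver, w in targets.items():
            incoming[receiver] = incoming.get(receiver, 0.0) + w
    return outgoing, incoming
```

Under this reading, the cell type with the largest outgoing total is the dominant sender, and the one with the largest incoming total is the dominant receiver.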

To examine the heterogeneity in the BC-2 and CC-5 samples, “CellChat” analyses were conducted separately for these two samples. The results revealed that in the BC-2 sample, fibroblasts and macrophages were the cell types with the highest level of interaction (Fig. S1A,B), consistent with the findings from the combined multi-sample analysis (Fig. 2C). In the CC-5 sample, Epithelial_cells showed the strongest outgoing signals, whereas fibroblasts displayed a slightly reduced outgoing signal strength. We hypothesize that this discrepancy may be linked to tissue sample selection: the CC-5 sample is distinguished by the highly squamous characteristics typical of cervical adenocarcinoma, a highly invasive disease subtype, and fibroblasts are closely associated with invasiveness and still showed significant interaction strength in the “CellChat” analysis. In the EC samples, Epithelial_cells had the strongest outgoing signals and macrophages the strongest incoming signals, while fibroblasts showed diminished communication strength. Consequently, to focus on the specific role of fibroblasts in cellular communication, only BC, CC, and OC samples were chosen for subsequent analysis; EC samples were not analyzed further.

Next, using the “netAnalysis_signalingRole” function, it was found that SPP1, MK, and MIF were common to both the outgoing and incoming signaling patterns (Fig. 3A-C). It is speculated that these common signaling patterns may fulfill similar roles in the TME across different diseases. Subsequently, an “enrichR” analysis was performed on the ligand-receptor pairs within these signaling pathways, revealing that the ligands and receptors involved were significantly enriched in pathways including Epithelial Mesenchymal Transition, PI3K-Akt Signaling Pathway, Hippo-Merlin Signaling Dysregulation, Hypoxia, and Glycolysis (Fig. S2A–C & Supplementary Table S2). Research indicates that solid tumor tissues are frequently poorly oxygenated owing to a scarcity of functional blood vessels, which creates “hypoxic regions” within the tumors34,35. Hypoxia can modify the malignant potential of tumors and their responsiveness to anti-tumor treatments. Critically, hypoxia stimulates the secretion of TGF-β and PDGF, factors that promote the differentiation of precursor cells into CAFs and bolster the expression of PDPN in these cells36. In colorectal cancer, hypoxic CAFs over-secrete TGF-β2 and robustly induce the expression of GLI2 via cancer stem cells, thereby enhancing the tumor’s resistance to therapeutic interventions. Moreover, previous research has demonstrated that hypoxia amplifies the expression and secretion of various immunosuppressive factors, including, but not limited to, IL6, IL10, VEGF, MMPs, and PD-L137,38. Our “enrichR” analysis revealed hypoxia-related enrichment pathways across these diseases, and these pathways are strongly associated with tumor development. Based on these findings, we propose that the fibroblast subpopulation may play a pivotal role in driving disease progression.

Fig. 3

Signaling patterns in BC, CC, and OC. (A) Outgoing and incoming signaling patterns in BC. (B) Outgoing and incoming signaling patterns in CC. (C) Outgoing and incoming signaling patterns in OC.

Malignant CAFs

CAFs represent a group of highly versatile, malleable, and robust cells that actively participate in cancer progression through complex interactions with other cell types within the TME39. Critically, CAFs are characterized by their heterogeneity, and both their attributes and the nature of their interactions with various cell types within the TME can undergo significant alterations as the cancer evolves40. To delineate the functions of CAFs in gynecological cancers, fibroblasts were subsequently isolated from BC, CC, and OC samples for detailed analysis.

To identify malignantly proliferating CAFs, “InferCNV” analysis was performed on the fibroblasts within these samples. The cell composition plot indicated that T_cells were the most prevalent immune cell type across the disease types, and T cells are less likely to undergo chromosomal copy number alterations during the onset and progression of cancer. Consequently, T_cells were selected as the reference normal cells in the “InferCNV” parameters. The analysis revealed that fibroblasts in different samples displayed varying degrees of malignant transformation (Fig. S3A–C), which supports our hypothesis that the oncogenic transformation of CAFs contributes to the occurrence and development of diverse diseases.

Fibroblasts subtypes classification

Studies have highlighted the necessity of studying the physiological functions of the different CAF subtypes41,42. To explore the specific roles of these CAFs in different diseases, malignant fibroblasts were subsequently extracted from each sample for further analysis. From the BC, CC, and OC samples, 1,427, 1,900, and 1,697 malignant fibroblasts were extracted, respectively.

For the classification of subtypes, the “harmony” algorithm was first applied for quality control and batch correction. Next, the “FindNeighbors” and “FindClusters” functions were used to segment the cells into clusters, and finally, the “FindAllMarkers” function was employed to identify the marker genes of these cell clusters. This sequence of steps yielded the following classification. In BC (Fig. 4A), two subtypes were identified: BC-iCAF (CXCL14) and BC-myCAF (MYH11). Similarly, in CC (Fig. 4B), the same two subtypes were identified: CC-iCAF (CXCL14) and CC-myCAF (RGS5). In OC (Fig. 4C), the fibroblast subtypes differed from those of the previous two diseases, being OC-proCAF (C7), OC-myCAF (MYH11), and OC-matCAF (POSTN) (Fig. S4). Research has indicated that matCAF is associated with tumor malignancy and is generally abundant in the advanced stages of cancer33. Among the samples selected, OC-6 was identified as a late-stage high-grade serous OC sample. The cell composition plot (Fig. 4D) showed a high abundance of matCAF in this sample, which aligns with previous research and suggests that matCAF is closely associated with the progression of advanced tumors.

Fig. 4

Identification and classification of malignant fibroblast subtypes in different cancer types. (A) The tSNE plot shows two distinct subtypes of fibroblasts in BC, labeled as BC-iCAF (CXCL14) and BC-myCAF (MYH11). (B) The tSNE plot shows two subtypes of fibroblasts in CC, namely CC-iCAF (CXCL14) and CC-myCAF (RGS5). (C) In OC, three fibroblast subtypes are identified: OC-proCAF (C7), OC-myCAF (MYH11), and OC-matCAF (POSTN). (D) The cell composition plot demonstrates a high abundance of matCAF in late-stage OC samples, indicating its association with advanced tumor development and progression.

To investigate the functions of these CAF subtypes in different tumors, marker genes for each subtype were selected and subjected to “enrichR” analysis (Fig. S5A–D). The analysis revealed that in BC and CC samples, iCAF was primarily associated with pathways such as Apical Junction, Epithelial Mesenchymal Transition, and Focal Adhesion-PI3K-Akt-mTOR-signaling. In contrast, myCAF was present across all tumor types and was significantly enriched in pathways such as VEGFA-VEGFR2 Signaling Pathway, Focal Adhesion, Epithelial Mesenchymal Transition, Myogenesis, Angiogenesis, Adipogenesis, and Apoptosis. proCAF and matCAF were exclusively present in OC samples, with each subtype enriched in distinct pathways. proCAF was enriched in pathways like Allograft Rejection, Myc Targets V1, and Interferon type I signaling, while matCAF was associated with Interferon Gamma Response, Glycolysis, and Metabolic reprogramming in colon cancer pathways. These analyses suggest that in BC and CC samples, both iCAF and myCAF subpopulations are present and associated with pathways such as Focal Adhesion-PI3K-Akt-mTOR-signaling and Epithelial Mesenchymal Transition, indicating that CAF subpopulations may affect disease onset and progression through the same signaling pathways in different diseases. In late-stage high-grade serous OC, a specific matCAF subpopulation was identified, which was enriched in pathways like Interferon Gamma Response, Glycolysis, and Photodynamic therapy-induced HIF-1 survival signaling, providing insights for the development of targeted therapies for late-stage high-grade serous OC.

The changes in CAFs subpopulations during disease progression

The classification of CAF subpopulations within diseases identified the main CAF subtypes as iCAF, myCAF, matCAF, and proCAF. Research indicates that CAFs actively participate in cancer progression through complex interactions with other cell types within the TME. Moreover, CAFs are heterogeneous cells, and their characteristics and functions may dynamically change as cancer progresses11. To understand the developmental changes of each CAF subpopulation across different diseases, “monocle” was subsequently employed for pseudotemporal analysis of the CAF subpopulations. The analysis revealed (Fig. 5A-C) that in BC and CC samples, the iCAF subpopulation was at the beginning of differentiation, while myCAF was at the late stage. However, in OC samples, the differentiation trend was opposite, with proCAF at the early stage followed by the differentiation of proCAF into myCAF and a small portion of matCAF, while most matCAF was at the late stage. The pseudotime analysis indicated that matCAF is predominantly detected in the mid to late stages of differentiation and is notably enriched within malignant tumor samples.

Fig. 5

Pseudotemporal analysis of cancer-associated fibroblast (CAF) subpopulations across different cancer types. The pseudotemporal trajectory analysis using “monocle” reveals that the iCAF subpopulation is at the initial stage of differentiation, while the myCAF subpopulation is at the late stage in both BC (A) and CC (B) samples. (C) In contrast to BC and CC, the differentiation trend in OC samples shows an opposite sequence, with the proCAF subpopulation at the early stage, followed by differentiation into myCAF and a small portion of matCAF.

In BC and CC, iCAF was at the beginning of differentiation and myCAF at the late stage. However, in OC, the differentiation trend was opposite. To verify whether this opposite direction of CAF differentiation is due to the cancer type itself or to sample bias, datasets GSE192898 and GSE235329 were obtained from the GEO database. Pseudotime analysis of the CAFs within these samples showed that matCAF was consistently at the late stage of differentiation, while the other CAF subtypes were at the early stage (Fig. S6A–D). This outcome suggests that the differentiation status of CAF subpopulations varies between late-stage and early-stage tumor samples, in agreement with prior research findings33.

Throughout the progression of the disease, cells are not isolated entities but communicate through a series of complex signaling mechanisms, with transcription factors playing a crucial role. To identify the active transcription factors and corresponding changes in expression levels at different stages of disease progression, “SCENIC” analysis was conducted. The results revealed that the iCAF subpopulation was specifically recognized by CREB3L1 and ELF1, the myCAF subpopulation was specifically recognized by MEF2C and HCFC1, the proCAF subpopulation was specifically recognized by GATA4, and the matCAF subpopulation was specifically recognized by RUNX2 (Fig. S7A–D). Interestingly, it was also observed that CREB3L1 in iCAFs, MEF2C in myCAFs, GATA4 in proCAFs, and RUNX2 in matCAFs showed changes in expression levels corresponding to changes in cellular differentiation status (Fig. S8A–C). Therefore, based on the pseudotemporal and “SCENIC” analysis results, it was ultimately determined that the iCAF, myCAF, proCAF, and matCAF subpopulations were specifically recognized by CREB3L1, MEF2C, GATA4, and RUNX2, respectively. This suggests that specific transcription factors within the TME may play important roles in intercellular communication.

Communication between macrophages and CAFs

The preceding analysis indicated that CAFs are a key cell type in the progression of tumors. However, the onset of tumor development is not solely attributed to the subpopulations of CAFs, but rather to the extent of interaction and communication between immune cells and CAF subpopulations within the TME. The “CellChat” analysis presented in Fig. 3A-C revealed that macrophages engage in the highest degree of interaction with CAF subpopulations within the TME across various diseases. Research has shown that macrophages are highly heterogeneous cells that differentiate into different subtypes during tumor development and invasion. These subtypes undertake a range of functions against tumor cells, including phagocytosis, antigen presentation, and tissue remodeling43. Therefore, macrophages were subsequently isolated from each tumor type for analysis. Macrophages were typed based on the marker genes of distinct cell clusters (Fig. 6A–C, Fig. S9). In BC, macrophages were divided into two subtypes: BC_Macro_FAM26F+ and BC_Macro_CHIT1+. In CC, they were classified into three subtypes: CC_Macro_APOE+, CC_Macro_CD300E+, and CC_Macro_FCER1A+. In OC, they were divided into two subtypes: OC_Macro_IFI27+ and OC_Macro_FCN1+. To investigate the differentiation status of macrophage subtypes within the TME, “Monocle” pseudotemporal analysis was subsequently performed. The analysis revealed that in BC samples, macrophages differentiated from BC_Macro_FAM26F+ to BC_Macro_CHIT1+, with the FAM26F and CHIT1 genes being the predominant drivers of cell differentiation (Fig. 7A–C). In CC samples, macrophage differentiation was not as clear as in BC. The CC_Macro_APOE+ cell cluster persisted throughout differentiation, while the CC_Macro_CD300E+ and CC_Macro_FCER1A+ cell clusters regulated gene expression only in the early and middle stages of differentiation. In OC samples, the differentiation state was similar to that of CC, with the OC_Macro_FCN1+ subpopulation regulating gene expression only in the early stages of differentiation, while the OC_Macro_IFI27+ subpopulation persisted throughout differentiation. These results suggest that despite the presence of different macrophage subtypes, their differentiation states exhibit certain similarities. Therefore, it is hypothesized that macrophages and CAFs may share common pathways to influence tumor initiation and development.

Fig. 6

Subtyping of macrophages based on marker gene expression in different cancer types. (A) The tSNE plot illustrates the two subtypes of macrophages identified in BC, labeled as BC_Macro_FAM26F+ and BC_Macro_CHIT1+. (B) In CC, macrophages are categorized into three distinct subtypes, named CC_Macro_APOE+, CC_Macro_CD300E+, and CC_Macro_FCER1A+. (C) The typing results for OC macrophages reveal two subtypes, identified as OC_Macro_IFI27+ and OC_Macro_FCN1+.

Fig. 7

Pseudotemporal analysis of macrophage subtype differentiation within the TME using “Monocle”. (A) The pseudotemporal trajectory shows the differentiation path of macrophages in BC samples, transitioning from BC_Macro_FAM26F+ to BC_Macro_CHIT1+. (B) The analysis reveals that the CC_Macro_APOE+ cell cluster is present throughout the differentiation process, while the CC_Macro_CD300E+ and CC_Macro_FCER1A+ cell clusters are active primarily in the early and middle stages of differentiation. (C) The differentiation state in OC samples resembles that in CC: the OC_Macro_IFI27+ subpopulation persists throughout the differentiation process, while the OC_Macro_FCN1+ subpopulation is active only in the early stages of differentiation.

COLLAGEN signaling patterns are regulated by CAFs and macrophages in the TME

To explore the cell-to-cell interactions between CAFs and macrophage subtypes within the TME, “CellChat” was employed. The analysis revealed that in BC samples, the two cell subtypes exhibiting the highest level of communication were iCAF and Macro_CHIT1+ (Fig. 8A & Fig. S10A). In CC samples, iCAF and Macro_FCER1A+ were the cell subtypes with the most intense communication (Fig. 8B & Fig. S10B), while in OC samples, matCAF and Macro_FCN1+ engaged in the most active cell-to-cell interactions (Fig. 8C & Fig. S10C). Across disease types, COLLAGEN signaling patterns were a common interface in which CAFs supplied the ligands and macrophages the receptors. In the ranking of ligand-receptor contributions, the COL1A1-CD44 ligand-receptor pair had the highest contribution (Figs. S11A–C, S12A–C). Moreover, given the limited number of BC samples in the GSE248288 dataset, three additional datasets (GSE221731, GSE179705, and GSE186344) were obtained to test the robustness of our analytical findings. Consistently, a notable contribution from the COL1A1-CD44 ligand-receptor pair within the COLLAGEN signaling pathway was observed (Fig. S13A,B), which supports the credibility of the analysis.
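The idea of ranking ligand-receptor contributions within a signaling pathway can be illustrated with a minimal Python sketch. It assumes that the relative contribution of a pair is its summed communication probability over all sender-receiver combinations, divided by the pathway total. The function name `rank_lr_contributions` and every number below are hypothetical toy values chosen for illustration; they are neither the study's data nor the CellChat implementation.

```python
from collections import defaultdict


def rank_lr_contributions(records):
    """Rank ligand-receptor pairs by their share of total pathway signaling.

    records: iterable of (sender, receiver, lr_pair, probability) tuples.
    Returns a list of (lr_pair, relative_contribution), largest share first.
    """
    totals = defaultdict(float)
    for _sender, _receiver, lr_pair, prob in records:
        totals[lr_pair] += prob  # sum over all sender-receiver combinations

    grand_total = sum(totals.values())
    if grand_total == 0:
        # No signaling at all: every pair contributes nothing.
        return [(pair, 0.0) for pair in sorted(totals)]

    ranked = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
    return [(pair, value / grand_total) for pair, value in ranked]


# Toy communication probabilities (hypothetical, for illustration only).
toy_records = [
    ("iCAF", "Macro_CHIT1+", "COL1A1-CD44", 0.30),
    ("myCAF", "Macro_CHIT1+", "COL1A1-CD44", 0.20),
    ("iCAF", "Macro_CHIT1+", "COL1A2-CD44", 0.25),
    ("iCAF", "Macro_FAM26F+", "COL6A1-CD44", 0.25),
]

ranking = rank_lr_contributions(toy_records)
for pair, share in ranking:
    print(f"{pair}: {share:.2f}")
```

In this toy input the COL1A1-CD44 pair ranks first only because the invented probabilities were chosen that way; with real data the ranking depends entirely on the inferred communication probabilities.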

Fig. 8

Cell-to-cell interaction analysis between CAFs and macrophage subtypes in the tumor microenvironment using “CellChat”. (A) The CellChat analysis for BC samples shows that the iCAF and Macro_CHIT1+ subtypes exhibit the highest level of communication among the cell subtypes within the tumor microenvironment. (B) In CC samples, the iCAF and Macro_FCER1A+ subtypes are identified as having the most active cell-to-cell interactions. (C) In OC, the matCAF and Macro_FCN1+ subtypes demonstrate the most significant cell-to-cell interactions.

To study the specific role of COLLAGEN signaling patterns in tumor progression, pseudotemporal algorithms were employed to track the changes in COLLAGEN signaling patterns and ligand-receptor interactions during cell differentiation (Fig. S14A–C). The expression of COLLAGEN signaling patterns peaked at the early stage of differentiation and subsequently decreased. The change in expression of the COL1A1 gene in iCAF was consistent with the alterations in COLLAGEN signaling patterns. Additionally, the COL1A1 ligand and the CD44 receptor showed opposite expression patterns within the COLLAGEN signaling patterns.
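One simple way to quantify an "opposite pattern of expression" along pseudotime is a correlation coefficient between the binned expression profiles of the two genes. The following Python sketch assumes mean expression has already been averaged into ordered pseudotime bins; the function `pearson` and the toy values are hypothetical and do not come from the study.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Toy mean expression per pseudotime bin, ordered early -> late
# (hypothetical values, for illustration only).
col1a1_by_bin = [4.0, 3.5, 2.5, 1.5, 1.0]  # ligand: peaks early, then declines
cd44_by_bin = [1.0, 1.4, 2.2, 3.1, 3.6]    # receptor: rises over pseudotime

r = pearson(col1a1_by_bin, cd44_by_bin)
peak_bin = col1a1_by_bin.index(max(col1a1_by_bin))
print(f"Pearson r = {r:.3f}; COL1A1 peaks in bin {peak_bin}")
```

A strongly negative coefficient would be consistent with opposing trends; a rank-based measure such as Spearman correlation could be substituted if the trends are monotonic but not linear.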

COLLAGEN is a major component of the ECM, with COL1A1 being the primary protein, widely distributed throughout the body’s solid organs and connective tissues. Fibroblasts within the TME are responsible for synthesizing and remodeling the ECM44. As a crucial component of the ECM, the COL1A1 protein is primarily secreted and deposited by CAFs45. Consequently, we hypothesize that during tumor development, the iCAF subpopulation initially secretes COL1A1 protein in an autocrine fashion, thereby promoting tumor cell migration via the COLLAGEN signaling pathway. Subsequently, iCAF differentiates into the myCAF subpopulation. As previously indicated by enrichment analysis, the myCAF subpopulation is significantly enriched in the VEGFA_VEGFR2_Signaling_Pathway, suggesting that it is primarily involved in angiogenesis within the TME and provides essential nutrients for tumor growth. Similarly, we observed that in late-stage OC, the COLLAGEN signaling patterns exert a comparable effect in the matCAF subpopulation. However, in contrast to BC and CC, the matCAF subpopulation in OC samples is present throughout the cell differentiation process, indicating that it represents a mature cell subpopulation that participates in the entire course of tumor development within the TME.

TAMs promote angiogenesis

Our prior analysis showed that CAF subtypes play different roles in different diseases. An “enrichR” analysis was then performed on the genes from the pseudotemporal analysis, revealing that the VEGFA_VEGFR2_Signaling_Pathway was the sole pathway significantly enriched across the different disease types (Fig. S15A–C). Additionally, our earlier analysis showed that the macrophage population expressing CD44 within the TME exhibited increasing expression levels as the CAF subtypes underwent differentiation. This suggests that these macrophage subtypes are likely to function as TAMs and participate in tumor angiogenesis.
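The over-representation logic that underlies this kind of pathway enrichment can be sketched with a one-sided hypergeometric test (equivalent to a one-sided Fisher exact test). The Python example below is a generic illustration under stated assumptions, not the enrichR code or the study's gene lists: the function name `hypergeom_enrichment_p` and all gene counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(n_universe, n_pathway, n_query, n_overlap):
    """One-sided over-representation p-value.

    Probability of observing at least `n_overlap` pathway genes when
    `n_query` genes are drawn without replacement from a universe of
    `n_universe` genes, of which `n_pathway` belong to the pathway.
    """
    max_overlap = min(n_pathway, n_query)
    total = comb(n_universe, n_query)
    tail = sum(
        comb(n_pathway, k) * comb(n_universe - n_pathway, n_query - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return tail / total


# Hypothetical counts: 20,000 background genes, a 400-gene pathway,
# 200 pseudotime-associated genes, 20 of which fall in the pathway.
p_value = hypergeom_enrichment_p(20000, 400, 200, 20)
print(f"p = {p_value:.3g}")
```

In practice, p-values from many pathways would also be corrected for multiple testing before a pathway is called significantly enriched.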

Analysis of the VEGFA_VEGFR2_Signaling_Pathway supported this hypothesis (Fig. S16A–C). The pathway pseudotemporal analysis revealed that the BC_myCAF_CHIT1+ subpopulation in BC, the CC_myCAF_FCER1A+ subpopulation in CC, and the OC_matCAF_FCN1+ subpopulation in OC all contribute significantly to angiogenic pathways. Additionally, the BC_Macro_CHIT1+, CC_Macro_FCER1A+, and OC_Macro_FCN1+ macrophage populations were all found to highly express the MMP9 gene. MMP-9 is secreted in its zymogen, or inactive, form by endothelial cells, leukocytes, fibroblasts, neutrophils, and macrophages. The synthesis of MMP-9 typically occurs in the bone marrow during granulocyte differentiation46. Matrix metalloproteinases (MMPs), including MMP9, MMP2, MMP3, and MMP7, are known to degrade components of the extracellular matrix and play a crucial role in tissue remodeling during pathological processes such as inflammation, tissue repair, tumor invasion, and metastasis, emerging as key regulators of angiogenesis and tumor progression45,46. Therefore, in BC, CC, and OC samples, TAMs such as Macro_CHIT1+, Macro_FCER1A+, and Macro_FCN1+ are involved in the generation of blood vessels within the TME, thereby facilitating tumor progression.

Discussion

An increasing number of studies indicate that the interaction between CAFs and TAMs plays a crucial role in tumor progression47,48,49,50. Traditionally, CAF subpopulations are classified, mainly according to their primary functions in different cancers, into iCAF and myCAF51,52. These CAF subtypes exhibit distinct characteristics within the TME. The heterogeneity of TAMs in the TME endows them with various functions, such as phagocytosis, antigen presentation, cytokine production, and tissue remodeling43. Previous studies using scRNA-seq have identified multiple TAM subtypes with different gene signatures and functions. Some studies use pan-markers such as CD68, CD163, or CD204 to determine the subtypes and proportions of macrophages and their specific functions53,54. Although CAFs and TAMs have been studied extensively, their molecular classification varies across cancer types, and to date no study has specifically focused on CAFs and TAMs in BC, CC, and OC. Therefore, we used single-cell sequencing data from published studies in the GEO database to characterize the specific roles of CAFs and TAMs in these three diseases.

Research has shown that the primary function of matCAFs is to secrete large amounts of collagen, which enhances ECM formation and makes them key organizers of the ECM33,55. Collagen is an important component of the tumor stroma and is recognized for its role in promoting tumor metastasis and progression56,57. Clinically, the unique biomarker profiles of each CAF subtype suggest that a high matCAF signature is associated with poor prognosis in almost all adenocarcinomas. myCAFs and iCAFs have been found to be two spatially distinct but interchangeable populations: myCAFs are located near cancer cells and exhibit elevated α-SMA levels, while iCAFs are located in the distal regions of the tumor and show inflammatory features with high IL-6 and low α-SMA levels57. Therefore, it is essential to accurately characterize CAF heterogeneity and subpopulations, identify different markers, and understand the functional roles of each population at different stages of disease progression.

In this study, we performed trajectory analysis on CAF subtypes, revealing their temporal patterns in the tumor environment. In BC and CC, iCAF is one of the earliest forms and subsequently differentiates into the myCAF state. In OC, proCAFs are one of the earliest forms, followed by myCAF and matCAF, which appear in the middle to late stages of tumor evolution. This indicates that CAF subtypes are not static during tumorigenesis but change dynamically as tumors progress.

Research indicates that aging CAFs exhibit a senescence-associated secretory phenotype (SASP) rich in pro-inflammatory cytokines, and that iCAFs perform robustly under hypoxic conditions55,56. Because aging and hypoxia are prevalent throughout tumor progression, they provide a basis for the evolution of iCAFs; consistent with this, we found that only the iCAF subset is significantly enriched in the hypoxia pathway. Furthermore, in advanced tumors, iCAFs gradually decrease, and the myCAF and matCAF subsets become more prevalent. We therefore propose that hypoxia in the TME leads to abnormal activation of the iCAF subset and the release of pro-inflammatory factors, which stimulate macrophages to differentiate into the M2 type, exhibiting pro-inflammatory characteristics and the ability to inhibit tumor cells. As the proportion of iCAFs decreases and the proportion of myCAFs and matCAFs rises over time, these cell populations inhibit immune cells during the malignant progression of tumors, leading to a decreased immune response.

Through functional enrichment analysis of ligand-receptor pairs across CAF subtypes, we observed functional convergence in the EMT and PI3K-Akt signaling pathways. These findings align with established oncogenic mechanisms: EMT enables tumor cells to acquire invasive properties via cytoskeletal reorganization and loss of epithelial polarity, while PI3K-Akt signaling regulates proliferation, survival, and metabolic adaptation57. As critical mediators of tumor-stroma crosstalk, CAFs interact with cancer cells through growth factor secretion (e.g., TGF-β, HGF) and extracellular matrix remodeling, creating a microenvironment conducive to EMT progression and PI3K-Akt activation58. Mechanistically, CAF-derived signals induce EMT-associated transcriptional reprogramming, which in turn triggers PI3K-Akt pathway activation through receptor tyrosine kinase transactivation59. This reciprocal regulation establishes a self-amplifying loop in which PI3K-Akt signaling further stabilizes EMT phenotypes by suppressing apoptosis and enhancing cell cycle progression58. Simultaneously, CAF-secreted proteases (e.g., MMP2/9) degrade basement membranes, liberating matrix-bound growth factors that sustain both pathways59. The interdependence between CAFs, EMT, and PI3K-Akt signaling highlights potential therapeutic vulnerabilities, suggesting that combined targeting of stromal-derived factors (e.g., LOXL2 inhibitors) and oncogenic pathway nodes (e.g., PI3Kδ isoform suppression) may disrupt this protumorigenic network more effectively than monotherapeutic approaches.

Different tumor immune microenvironment (TIME) phenotypes, such as the immunoinflammatory microenvironment, the immunorejection microenvironment, and the immunological desert microenvironment, are associated with various cancer types and with the efficacy of immunotherapy60,61. In particular, many recent studies have shown that different macrophage subtypes perform specific functions within tumors, making macrophage-targeted therapy an attractive immunotherapy strategy60. In our study, we extracted macrophages from each type of tumor. Because the distribution of macrophages varies, classifying them into the traditional M1 and M2 types may not suit our data.

Therefore, we defined seven macrophage subtypes through marker genes, among which Macro_CHIT1+, Macro_FCER1A+, and Macro_FCN1+ exhibit TAM characteristics. Moreover, pseudotime analysis revealed that these subgroups exist throughout macrophage differentiation. Thus, a deeper understanding of the interactions between these cells and CAFs, as well as of their unique functions, may aid the development of therapies for these diseases.

To investigate the interactions between TAMs and CAFs, we employed “CellChat” analysis. The analysis revealed that the COLLAGEN signaling pathway is the most highly interacting signaling pattern, with COL1A1-CD44 being the ligand-receptor pair that contributes most to the pathway. COLLAGEN is a major component of the ECM and is essential for the normal functioning of tissues. It plays a crucial role in maintaining the stability and integrity of tissues and organs. COL1A1 is a primary component of type I collagen, widely distributed in the interstitial and connective tissues of various solid organs throughout the body62. In a study of 97 BC patients, the expression level of COL1A1 was significantly higher in patients with metastasis than in those without63. Li et al. compared the expression profiles of COL1A1 mRNA in malignant tissues (55 cases), precancerous tissues (27 cases), and normal tissues (19 cases), showing that the expression level of COL1A1 was significantly higher in precancerous and malignant tissues than in normal tissues64. In another study, COL1A1 was found to act via ITGB1 to activate the AKT signaling pathway, thereby promoting epithelial-mesenchymal transition and enhancing tumor cell metastasis45. Based on these established findings, we infer that the COLLAGEN signaling pathway is a major factor in the epithelial-mesenchymal transition that leads to the malignant progression of tumors.

In BC and CC samples, we found that the myCAF and iCAF subpopulations are involved in the COLLAGEN signaling pathway through autocrine and paracrine mechanisms, respectively. Initially, myCAFs secrete COL1A1, which acts on the myCAFs themselves and leads to abnormal activation of TGF-β, resulting in ECM deposition and promoting massive proliferation of cancer cells. Subsequently, the iCAF subpopulation at a distance from the tumor also receives the COL1A1 signal and releases pro-inflammatory cytokines that polarize macrophages into M2-type macrophages and elicit an immune response.

However, clinical studies have shown that TGF-β derived from CAFs has an immunosuppressive effect and can strongly influence the function of various immune cell types, including T cells, macrophages, and neutrophils65,66,67. Furthermore, we found that the expression of CD44 gradually increases as the tumor progresses. Studies have indicated that CD44 is the most common surface marker of cancer stem cells (CSCs) and plays a key role in the communication between CSCs and the microenvironment, as well as in the regulation of stem cell properties68. Additionally, research has found that the influence of TAMs (tumor-associated macrophages) on CSC function depends on the hyaluronic acid (HA)-CD44 interaction and the expression of CD44 subtypes69. Therefore, we hypothesize that the Macro_CHIT1+, Macro_FCER1A+, and Macro_FCN1+ subpopulations are involved in the expression and signal transduction of CD44, thereby regulating the stemness of CSCs and promoting tumor progression. In OC, we found that matCAF and myCAF both act in an autocrine manner in the COLLAGEN signaling pathway. We speculate that in the late stage of the tumor, it is primarily matCAF that secretes COL1A1 and acts on itself, followed by interaction with CD44, allowing the TAM subpopulations (Macro_CHIT1+, Macro_FCER1A+, Macro_FCN1+) to exert tumor-promoting functions. Our analysis therefore identifies COL1A1-CD44 as a potential therapeutic target in BC, CC, and OC.

In our study, we also found a close relationship of CAFs and TAMs with the VEGFA_VEGFR2_Signaling_Pathway. TAMs have the potential to drive tumor angiogenesis, and in various mouse models, TAMs are a major source of MMP970. MMP9 promotes angiogenesis through its ECM degradation properties71. In vivo studies using Mmp9−/− mice have shown that MMP9 deficiency is associated with various abnormalities in processes related to cancer development, such as inflammation, wound healing, and vascular wall remodeling72,73. Shchors et al. observed that MMP9 gene deletion significantly impaired tumor growth and angiogenesis in two independent models of pancreatic neuroendocrine cancer, with MMP9-deficient tumors in both models exhibiting a significantly more aggressive phenotype compared to their wild-type counterparts74,75,76,77,78,79. Our analysis suggests that TAMs secrete MMP9 to induce a VEGF-mediated angiogenic response, providing a continuous supply of nutrients and energy for the proliferation of tumor cells. Therefore, targeting the TAM-angiogenesis axis may offer a promising therapeutic opportunity for cancer treatment.

In summary, this study used scRNA-seq data to explore intercellular communication within BC, CC, and OC samples and analyzed the roles that ligand-receptor pairs play in signaling pathways. Our results indicate that BC, CC, and OC are characterized by the COL1A1-CD44 interaction within the COLLAGEN signaling framework, and that CAFs and TAMs communicate via the VEGF pathway to supply the essential substrates for cancer cell proliferation. These regulatory elements may deepen our understanding of the biology of disease progression and provide a conceptual foundation for new therapeutic strategies. Nonetheless, our findings have not yet been corroborated experimentally, and the study shares the drawbacks common to research that relies solely on bioinformatics analysis. First, our conclusions are derived exclusively from bioinformatics analyses of disease samples, and their integrity depends on accurate and unbiased input data; any discrepancies or biases in the raw data could compromise the analysis. Second, the choice of algorithm and parameters is critical, because different algorithms employ distinct parameters and models, and a poorly informed selection may yield erroneous outcomes. Furthermore, inappropriate parameter settings can cause a model to overfit the training dataset, diminishing its predictive utility for new data. Finally, the lack of uniform standards across analytical tools and databases complicates data harmonization and raises potential ethical and privacy concerns.
In future work, we aim to address these challenges and to strengthen the validation of our analytical findings through additional biological experiments, thereby improving the generalizability of our findings.