Introduction

Lung cancer remains a significant global public health challenge and is the leading cause of cancer-related mortality worldwide, affecting both men and women. An estimated 127,070 deaths from lung cancer were projected for 20231. Pathologically, lung cancer is classified into two main types: non-small cell lung cancer (NSCLC) and small cell lung cancer. NSCLC is further categorized into adenocarcinoma, squamous cell carcinoma, and large cell carcinoma, with adenocarcinoma accounting for approximately 40% of lung cancer cases2. Current treatment modalities for lung adenocarcinoma (LUAD) include surgery, radiotherapy, chemotherapy, targeted therapies, and immune checkpoint inhibitors (ICIs), which are the preferred treatments3. Recently, ICIs and targeted therapies have shown promising results in treating advanced LUAD, with some patients achieving remission4. However, acquired drug resistance significantly limits the effectiveness of these therapies, contributing to the persistently low 5-year overall survival rate in LUAD patients5,6. This diminishing efficacy highlights the urgent need for novel, highly sensitive early diagnostic biomarkers, as well as new therapeutic strategies aimed at improving survival and quality of life for affected individuals. This challenge has become a key focus in clinical lung cancer research.

A critical aspect of cancer research is the identification of the molecular players involved in tumor progression. RNA-binding motif (RBM) proteins, a specialized subgroup of RNA-binding proteins (RBPs), have recently gained considerable attention in cancer biology7. These proteins are extensively involved in various biological processes, including RNA splicing, stability, transport, and regulation of translation8. Emerging evidence suggests that RBM proteins play crucial roles in tumor initiation and progression, with abnormal expression closely linked to cancer metastasis, therapy resistance, and overall disease progression8. For instance, RBM5 and RBM10 have been identified as tumor suppressors in lung cancer, where they regulate RNA splicing and the expression of apoptosis-related genes9,10. Additionally, RBM proteins modulate immune responses within the tumor microenvironment (TME), influencing immune evasion and treatment efficacy11,12.

Despite growing recognition of the significance of RBM proteins in cancer, the specific mechanisms and prognostic implications of these proteins in LUAD remain insufficiently understood, warranting further investigation. Among the RBM family members, RBMS3 has garnered particular interest due to its potential tumor-suppressive functions. As an RNA-binding protein, RBMS3 has been implicated in the regulation of several cancers13,14,15, and participates in critical cellular processes, including proliferation, apoptosis, and migration, by modulating RNA splicing and stability16. RBMS3 is underexpressed in a variety of cancers, including ovarian17, gastric18, esophageal19, breast20, and lung squamous carcinoma21, and its expression levels correlate with multiple clinical indicators and patient prognosis, highlighting its role in cancer progression. However, the specific role of RBMS3 in LUAD remains poorly understood, and its underlying mechanisms are yet to be fully elucidated22.

In this study, we aimed to investigate the biological functions and regulatory mechanisms of RBM proteins, with a particular focus on RBMS3, in the development and progression of LUAD. We employed transcriptome analysis, bioinformatics approaches, and cell-based experiments to explore these mechanisms. Our findings could provide valuable insights for the development of novel diagnostic and therapeutic strategies for LUAD.

Materials and methods

Data collection

RNA-seq data and clinical information were obtained from The Cancer Genome Atlas (TCGA) database, including 515 LUAD and 59 normal tissue samples. After excluding samples with missing survival data, a total of 504 LUAD and 57 normal tissue samples were retained for analysis. Additionally, the GSE116959 dataset, which included 49 LUAD and 11 normal tissue samples, was downloaded from the Gene Expression Omnibus (GEO) database. The GSE11969 dataset was also downloaded from GEO, and 94 of its LUAD samples with gene expression data, survival data, and clinical information were included in the study. Expression levels of 11 RBM genes were analyzed, and 10 genes were selected for subsequent experiments.

Differential expression analysis

The gene expression matrix was analyzed using the R package “limma”. Differentially expressed genes (DEGs) were identified based on the criteria of |log2(fold change)| ≥ 0.5 and an adjusted P-value ≤ 0.05. To visualize these DEGs, a volcano plot was constructed using the R package “ggplot2”, and a heatmap was generated using the “pheatmap” package. This approach was applied to the 561 samples from the TCGA database (504 LUAD and 57 normal), to the 60 samples from the GSE116959 dataset, and to the comparison between the high-risk and low-risk groups. The DEGs screened from the TCGA database were designated DEGs1, and those screened from the GSE116959 dataset were designated DEGs2. Candidate key genes were obtained by intersecting DEGs1 with the 11 genes of the RBM family. To characterize the expression trends of the two candidate key genes, their expression levels in the disease and control groups were compared using the Wilcoxon test in the TCGA-LUAD and GSE116959 datasets separately. A P-value of less than 0.05 was considered significant.
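The thresholding and intersection steps described above can be sketched as follows. This is a minimal illustration in Python rather than the R workflow used in the study; apart from the RBMS3 entry, which reuses the values reported in the Fig. 1 legend, every gene name and number is a hypothetical placeholder, and the RBM list shown is only an assumed subset of the 11 family genes.

```python
# Hypothetical limma-style results: gene -> (log2 fold change, adjusted P-value).
# Only the RBMS3 entry reuses values reported in the Fig. 1 legend; the rest are invented.
results = {
    "GENE_A": (1.40, 0.001),
    "GENE_B": (-0.30, 0.0005),  # significant, but below the fold-change cutoff
    "GENE_C": (-0.90, 0.20),    # large change, but not significant
    "RBMS3": (-1.82, 3.2e-5),
    "RBMS2": (-0.75, 0.004),
}

LOG2FC_CUTOFF = 0.5  # |log2(fold change)| >= 0.5
ADJ_P_CUTOFF = 0.05  # adjusted P-value <= 0.05


def select_degs(table, lfc_cutoff=LOG2FC_CUTOFF, p_cutoff=ADJ_P_CUTOFF):
    """Return {gene: 'up' or 'down'} for genes passing both thresholds."""
    degs = {}
    for gene, (log2fc, adj_p) in table.items():
        if abs(log2fc) >= lfc_cutoff and adj_p <= p_cutoff:
            degs[gene] = "up" if log2fc > 0 else "down"
    return degs


degs1 = select_degs(results)

# Assumed subset of the RBM family list; the full 11-gene list is not given in the text.
rbm_family = {"RBMS2", "RBMS3", "RBM5", "RBM10"}
candidates = sorted(set(degs1) & rbm_family)

print(degs1)
print(candidates)
```

A gene must pass both cutoffs to be retained, which is why GENE_B and GENE_C are dropped in this toy table.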

Weighted gene co-expression network analysis (WGCNA)

WGCNA was performed to assess the gene co-expression patterns in the TCGA dataset. A hierarchical clustering approach was employed, with a height threshold of 20,000 to exclude obvious outlier samples, retaining 542 samples for subsequent analysis. A soft-thresholding power value was chosen to construct a scale-free network, with a degree of independence greater than 0.90. Gene modules were identified using a dynamic tree-cutting algorithm, with a minimum of 30 genes required per module. Correlations between these gene modules and clinical traits were then established. The adjacency matrix was subsequently transformed into a topological overlap matrix to further investigate the interrelationships among genes. Finally, the verboseScatterplot function was used to generate a scatterplot for the visual representation of these correlations.
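The transformation from correlations to an adjacency matrix and then to a topological overlap matrix can be illustrated with a small sketch. It assumes an unsigned network, in which the adjacency is a_ij = |cor(x_i, x_j)|^β, and the standard topological overlap definition TOM_ij = (l_ij + a_ij) / (min(k_i, k_j) + 1 − a_ij); whether the study used an unsigned or signed network is not stated, and the four-gene expression matrix is invented. The power β = 5 is the value reported in the Results.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def adjacency(expr, beta):
    """Unsigned adjacency a_ij = |cor|**beta; the diagonal is set to 0 here."""
    g = len(expr)
    adj = [[0.0] * g for _ in range(g)]
    for i in range(g):
        for j in range(i + 1, g):
            adj[i][j] = adj[j][i] = abs(pearson(expr[i], expr[j])) ** beta
    return adj


def topological_overlap(adj):
    """TOM_ij = (l_ij + a_ij) / (min(k_i, k_j) + 1 - a_ij)."""
    g = len(adj)
    k = [sum(row) for row in adj]  # connectivity of each gene
    tom = [[1.0 if i == j else 0.0 for j in range(g)] for i in range(g)]
    for i in range(g):
        for j in range(i + 1, g):
            shared = sum(adj[i][u] * adj[u][j] for u in range(g) if u not in (i, j))
            tom[i][j] = tom[j][i] = (shared + adj[i][j]) / (min(k[i], k[j]) + 1.0 - adj[i][j])
    return tom


# Invented expression matrix: rows are genes, columns are samples.
expr = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, 4.0, 6.0, 8.0, 10.5],
    [5.0, 3.0, 4.0, 1.0, 2.0],
    [1.0, 3.0, 2.0, 5.0, 4.0],
]
adj = adjacency(expr, beta=5)
tom = topological_overlap(adj)
```

Raising the correlation to the power β suppresses weak correlations much more than strong ones, which is what pushes the network toward a scale-free topology; the topological overlap then also rewards gene pairs that share neighbours.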

Functional enrichment analysis

DEGs in the MEblue module were identified by intersecting them with the DEGs from both the TCGA and GEO databases. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses were conducted using the “clusterProfiler” package (version 3.14.3) in R to annotate the biological functions of the DEGs within the MEblue module, as well as those within the high- and low-risk groups23,24,25. Bar charts of the enrichment results were generated using the “ggplot2” R package. Additionally, protein–protein interaction (PPI) network analysis was performed to explore the functional relationships of the DEGs in the MEblue module.
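Over-representation analysis of this kind rests on a hypergeometric test: given N background genes, K of which are annotated to a term, it asks how likely an overlap of at least k annotated genes is in a list of n genes. The sketch below shows that calculation in Python. The list size of 651 is the number of co-DEGs reported in the Results, while the background size, term size, and overlap are hypothetical, and multiple-testing correction is omitted.

```python
from math import comb


def enrichment_p(background, annotated, list_size, overlap):
    """Upper-tail hypergeometric probability P(X >= overlap).

    background: number of genes in the universe (N)
    annotated:  number of background genes annotated to the term (K)
    list_size:  number of genes in the query list (n)
    overlap:    number of query genes annotated to the term (k)
    """
    upper = min(annotated, list_size)
    favourable = sum(
        comb(annotated, i) * comb(background - annotated, list_size - i)
        for i in range(overlap, upper + 1)
    )
    return favourable / comb(background, list_size)


# Hypothetical term: 150 of 20,000 background genes annotated, 15 of them among 651 query genes.
p_term = enrichment_p(background=20000, annotated=150, list_size=651, overlap=15)
print(p_term)
```

With these placeholder numbers about five annotated genes would be expected by chance, so an overlap of 15 yields a small P-value.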

Construction of prognostic model

RBMS3 was identified as a significant factor associated with patient survival through univariate Cox regression analysis. Based on this finding, we incorporated RBMS3 and relevant clinical factors to construct a risk score model using multivariate Cox regression analysis. LUAD patients from the TCGA database were subsequently classified into low-risk and high-risk groups according to the median risk score. To assess the predictive accuracy of the model, Kaplan–Meier (K-M) survival analysis was performed to compare survival outcomes between the two groups, and time-dependent receiver operating characteristic (ROC) curves were generated to evaluate the model’s performance. To validate the risk model, the R package “survivalROC” was used to draw ROC curves for 1-year, 3-year, and 5-year survival in the tumor samples of the validation set (GSE11969), and the area under the curve (AUC) was calculated. In addition, the R package “rms” (v6.5.0)26 was used to construct 1-year, 3-year, and 5-year calibration curves of the risk model for the TCGA-LUAD tumor samples. The calibration curves display the relationship between the model’s predicted probabilities and the observed event rates, and help reveal problems such as overfitting or underfitting.
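The risk score and the median split can be sketched as follows, assuming the usual linear predictor of a Cox model (the sum of each retained covariate multiplied by its coefficient). The three covariates are those retained in the final model according to the Results; the coefficients, the numeric coding of stage and N status, and the patient records are hypothetical.

```python
import statistics

# Hypothetical Cox coefficients (log hazard ratios); the fitted values are not given in the text.
coefficients = {"stage": 0.35, "pathologic_N": 0.25, "RBMS3": -0.27}

# Hypothetical patients; stage and N status are coded numerically for illustration only.
patients = [
    {"id": "P1", "stage": 1, "pathologic_N": 0, "RBMS3": 2.1},
    {"id": "P2", "stage": 3, "pathologic_N": 2, "RBMS3": 0.8},
    {"id": "P3", "stage": 2, "pathologic_N": 1, "RBMS3": 1.5},
    {"id": "P4", "stage": 4, "pathologic_N": 2, "RBMS3": 0.5},
]


def risk_score(patient, coefs):
    """Linear predictor: sum of coefficient * covariate value."""
    return sum(coefs[name] * patient[name] for name in coefs)


scores = {p["id"]: risk_score(p, coefficients) for p in patients}
cutoff = statistics.median(scores.values())
groups = {pid: ("high" if score > cutoff else "low") for pid, score in scores.items()}

print(scores)
print(groups)
```

Because the RBMS3 coefficient is negative in this sketch, higher RBMS3 expression lowers the score, which matches the direction of the reported hazard ratio below 1.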

Immune cell infiltration

CIBERSORT is a computational tool used to estimate the relative proportions of different immune cell types within a mixed cell population. TIMER, a web-based resource, provides data on the abundances of six specific immune cell types (B cells, CD8+ T cells, CD4+ T cells, dendritic cells, neutrophils, and macrophages) within the TME. This resource is widely utilized to assess the role of immune cells in various types of cancer. In our study, CIBERSORT was used to estimate the proportions of 22 immune cell subtypes in both high- and low-risk groups, and TIMER was employed to analyze the correlation between immune cell infiltration and RBMS3 expression.

Cell culture and transient transfection

Human NCI-H1395 and A549 cell lines were purchased from iCell Biotechnology. The NCI-H2087, PC-9, and human bronchial epithelioid (HBE) cell lines were purchased from Procell. All cell lines were cultured in RPMI-1640 medium (Hyclone) supplemented with 10% fetal bovine serum (FBS, Gibco) and 1% penicillin–streptomycin (Gibco), and maintained in 5% CO2 at 37 °C. The cells were seeded in 6-well plates and grown to approximately 70% confluence before being harvested for transient transfection. Lipofectamine 2000 (Life Technologies) was used to transfect Negative Control (NC) and OE-RBMS3 (GenePharma) plasmids into A549 and PC-9 cells according to the manufacturer’s protocol. The target sequences for the overexpression construct are listed in Supplementary Table 1.

RNA extraction and quantitative real‐time PCR (qRT‐PCR)

Total RNA was extracted from the cells using TRIzol reagent (TIANGEN) according to the manufacturer’s instructions. The RNA was then reverse transcribed into complementary DNA (cDNA) using the FastQuant RT Kit (TIANGEN). qRT-PCR was performed to amplify the cDNA using SuperReal PreMix Plus (TIANGEN). The PCR conditions were as follows: an initial denaturation step at 95 °C for 15 min, followed by 40 cycles of 95 °C for 30 s and 60 °C for 60 s. The 2^(−ΔΔCt) method was used to quantify RBMS3 expression levels, with GAPDH as the endogenous control. The following primers were used: RBMS3 forward: 5′-TCCAAACCAAGCAGTCCTATGC-3′, reverse: 5′-GATTAGGTCCTGGTCAGTGGTG-3′; GAPDH forward: 5′-GAAGGTCGGAGTCAACGGATTT-3′, reverse: 5′-GCCATGGGTGGAATCATATTGG-3′.
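As a brief illustration of the ΔΔCt calculation, the sketch below normalizes the target gene to the reference gene in each sample and then to the control sample. The Ct values are hypothetical and were chosen only to make the arithmetic easy to follow.

```python
def relative_expression(ct_target_sample, ct_ref_sample, ct_target_control, ct_ref_control):
    """Fold change of the target gene by the 2^(-ddCt) method."""
    delta_sample = ct_target_sample - ct_ref_sample      # dCt in the test sample
    delta_control = ct_target_control - ct_ref_control   # dCt in the control sample
    delta_delta = delta_sample - delta_control           # ddCt
    return 2 ** (-delta_delta)


# Hypothetical Ct values: RBMS3 and GAPDH in a tumor cell line versus control cells.
fold = relative_expression(
    ct_target_sample=27.0, ct_ref_sample=18.0,
    ct_target_control=25.0, ct_ref_control=18.0,
)
print(fold)
```

Here the target crosses the threshold two cycles later in the test sample after normalization, so ΔΔCt = 2 and the relative expression is one quarter of the control level.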

Immunohistochemistry (IHC)

Human LUAD tissue sections were deparaffinized in xylene and rehydrated through a graded ethanol series. The sections were then heated in a citric acid buffer (Biosharp, 0.01 M) at 105 °C for 10 min to retrieve the antigens. To block endogenous peroxidase activity, the tissue sections were treated with a 3% hydrogen peroxide solution. Subsequently, the sections were incubated with 10% goat serum albumin for 30 min to reduce nonspecific binding. The tissue sections were then incubated overnight at 4 °C with anti-RBMS3 primary antibody (Abcam, 1:200). After washing with PBS, the sections were incubated with a horseradish peroxidase (HRP)-conjugated secondary antibody. Finally, the sections were stained with diaminobenzidine (DAB) and examined under a microscope at 100× magnification.

Western blotting

Western blotting was performed to examine the overexpression levels of RBMS3. Total proteins were extracted from cells using RIPA lysis buffer (Solarbio), and their concentrations were measured using a bicinchoninic acid (BCA) protein assay. Equal amounts of protein were separated by SDS-PAGE and transferred to polyvinylidene fluoride (PVDF, Millipore) membranes. The membranes were blocked with 5% nonfat dry milk and incubated with primary antibodies, mouse anti-GAPDH (1:3000, Affinity) and rabbit anti-RBMS3 (1:1000, Abcam), at 4 °C overnight. After three washes with TBST, the PVDF membranes were incubated with a secondary antibody (Abcam, 1:3000) for 2 h at 37 °C. The immunoreactivity of the PVDF membranes was detected using enhanced chemiluminescence (ECL, Millipore).

CCK8 assays

Cell proliferation was quantified using the Cell Counting Kit-8 (CCK-8, DOJINDO). A549 and PC-9 cells (5 × 10³ cells/well) were seeded into 96-well culture plates and incubated at 37 °C with 5% CO2 for 24, 48, and 72 h. Subsequently, 10 μL of CCK-8 solution was added to each well, and the plates were incubated at 37 °C for an additional 2 h. The absorbance of each well was measured at a wavelength of 450 nm.

Transwell invasion and wound healing assays

Transwell invasion and wound healing assays were performed to evaluate the effect of RBMS3 on the invasion and migration capabilities of the cell lines. For the invasion assays, 100 μL of cell suspension was added to the upper chamber (BD), and 500 μL of medium containing 10% fetal bovine serum (FBS) was added to the lower chamber. After 24 h of incubation, the membrane was fixed with 4% paraformaldehyde for 10 min and then stained with 0.1% crystal violet for 10 min. Cells in three randomly selected fields were counted under an inverted microscope (Nikon). For the wound healing assays, 3 μL of cell suspension (1 × 10⁶ cells/well) was plated in 6-well plates, and a wound was created by scraping the cells with a sterile pipette tip. The cells were washed three times with 0.01 M PBS to remove detached cells and then incubated in a 37 °C, 5% CO2 incubator. Wound closure was observed and recorded using an inverted microscope at 0, 12, and 24 h.
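The two readouts can be quantified as in the sketch below. The formulas are common conventions and are assumptions here, since the text does not state how the percentages were derived: invasion is summarized as the percent reduction in the mean cell count over three fields, and migration as the percentage of the initial wound width that has closed. All counts and widths are hypothetical.

```python
from statistics import mean


def percent_reduction(control_counts, treated_counts):
    """Percent decrease in the mean invaded-cell count relative to the control."""
    control_mean = mean(control_counts)
    treated_mean = mean(treated_counts)
    return (control_mean - treated_mean) / control_mean * 100.0


def wound_closure(width_0h, width_t):
    """Percentage of the initial wound width closed at a later time point."""
    return (width_0h - width_t) / width_0h * 100.0


# Hypothetical counts from three random fields per group.
nc_fields = [120, 130, 110]
oe_fields = [40, 45, 35]
invasion_drop = percent_reduction(nc_fields, oe_fields)

# Hypothetical wound widths (in micrometres) at 0 h and 24 h.
closure_24h = wound_closure(width_0h=800.0, width_t=500.0)

print(round(invasion_drop, 2), round(closure_24h, 2))
```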

Cell cycle and apoptosis analyses

Flow cytometry was performed to analyze the cell cycle and apoptosis. Seventy-two hours after transfection, the NC and OE-RBMS3 cells were collected and digested into a single-cell suspension. After discarding the supernatant, the cells were stained with Cystain DNA 1-step staining solution (1 mL, Beyotime) for cell cycle analysis using flow cytometry. For apoptosis analysis, the NC and OE-RBMS3 cells were washed with PBS, treated with trypsin, and the supernatant was discarded. The cells were then stained with propidium iodide (PI, 20 µg/mL, DOJINDO) for 15 min in the dark.

Statistical analysis

Statistical analysis was performed using GraphPad Prism 7 software. Each experiment was repeated three times, and data were expressed as the mean ± standard deviation (SD). The K-M method was used to generate survival curves, and differences between groups were compared using the log-rank test. Multivariate analysis was performed using the Cox proportional hazards model. A P-value of less than 0.05 was considered statistically significant.
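For readers who wish to see what the K-M estimate and the log-rank comparison compute, a minimal pure-Python sketch is given here. It is not the GraphPad Prism procedure used in the study; it assumes two groups, right-censored data with event indicators coded 1 (death) or 0 (censored), and a chi-square reference distribution with one degree of freedom. The survival times are hypothetical.

```python
import math


def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        same_time = [e for tt, e in data if tt == t]
        deaths = sum(same_time)
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= len(same_time)  # events and censored cases both leave the risk set
        i += len(same_time)
    return curve


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square statistic, P-value)."""
    pooled = list(zip(times_a, events_a)) + list(zip(times_b, events_b))
    event_times = sorted({t for t, e in pooled if e == 1})
    observed_a = expected_a = variance = 0.0
    for t in event_times:
        n_a = sum(1 for x in times_a if x >= t)
        n_b = sum(1 for x in times_b if x >= t)
        d_a = sum(1 for x, e in zip(times_a, events_a) if x == t and e == 1)
        d_b = sum(1 for x, e in zip(times_b, events_b) if x == t and e == 1)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    chi2 = (observed_a - expected_a) ** 2 / variance
    p_value = math.erfc(math.sqrt(chi2 / 2.0))  # upper tail of chi-square with 1 df
    return chi2, p_value


# Hypothetical follow-up times (months) and event indicators.
high_t, high_e = [2, 3, 4, 5, 6], [1, 1, 1, 1, 1]
low_t, low_e = [8, 9, 10, 12, 14], [1, 0, 1, 0, 1]

curve = kaplan_meier([1, 2, 3, 4, 5], [1, 0, 1, 1, 0])
chi2, p_value = logrank_test(high_t, high_e, low_t, low_e)
print(curve)
print(chi2, p_value)
```

Censored observations reduce the number at risk without lowering the survival estimate, which is why the curve only steps down at event times.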

Results

Identification of co-differentially expressed RBM genes

To identify DEGs between LUAD and normal tissues, we utilized the “limma” package to analyze data from the TCGA database. Applying the criteria of |log2(fold change)| ≥ 0.5 and adjusted P-value ≤ 0.05, a total of 4,456 DEGs were identified, comprising 2,535 up-regulated and 1,921 down-regulated genes (Fig. 1a). Detailed information on these DEGs is provided in Supplementary Table 2. The expression profiles of the top 50 up-regulated and down-regulated genes were visualized in a heatmap (Fig. 1b). After performing an intersection analysis, RBMS2 and RBMS3 were identified as co-differentially expressed RBM genes (Fig. 1c). Moreover, the expression levels of RBMS2 and RBMS3 in LUAD and normal tissues were consistent with the TCGA DEGs results (Supplementary Table 3). Next, the expression levels of RBMS2 and RBMS3 were analyzed in the TCGA-LUAD and GSE116959 datasets. The results showed that both RBMS2 and RBMS3 were down-regulated in tumors (Supplementary Figs. 1–2).

Fig. 1

Comparative analysis of differentially expressed genes (DEGs) in LUAD vs. normal tissues. (a) DEGs between lung adenocarcinoma (LUAD) and normal tissue samples were identified using stringent selection criteria: |log2(fold change)| ≥ 1.0 (horizontal dashed lines) and an adjusted P-value ≤ 0.05 (vertical dashed line, -log10(adj.P) = 1.3). A total of 4,456 DEGs were identified, including 2,535 significantly upregulated (red, log2FC > 0) and 1,921 downregulated genes (blue, log2FC < 0). The tumor suppressor RBMS3 (arrow) exhibited marked downregulation (log2FC = -1.82, adjusted P = 3.2 × 10⁻⁵) and was prioritized for further analysis. (b) Heatmap illustrating the expression profiles of the top 50 most significantly upregulated and downregulated genes in LUAD compared to normal tissues. The color gradient represents expression levels, where red indicates higher expression and blue indicates lower expression. (c) Venn diagram depicting the overlap of the DEGs with RBM family genes, identifying RBMS2 and RBMS3 as co-differentially expressed RBM genes. This highlights their potential involvement in the pathogenesis of LUAD.

Identification of key modules in LUAD

We performed WGCNA using the TCGA database to identify the key modules associated with LUAD. To construct a scale-free network for gene interactions, we selected a soft threshold power of β = 5 (scale-free R² = 0.9) (Fig. 2a). A total of thirty-eight modules were identified through co-expression matrix analysis (Fig. 2b). As shown in Fig. 2c, the MEpink module exhibited the strongest positive correlation with LUAD (correlation coefficient = 0.72, P = 3e-87), whereas the MEblue module showed the strongest negative correlation (correlation coefficient = -0.83, P = 6e-141). Scatterplots of gene significance (GS) versus module membership (MM) for the MEpink and MEblue modules demonstrated strong correlations (MEpink: correlation coefficient = 0.89, P = 1e-200; MEblue: correlation coefficient = 0.91, P = 1e-200; Fig. 2d). Based on these results, the MEblue and MEpink modules were selected for further analysis.

Fig. 2

Identification of key modules associated with LUAD using WGCNA. (a) Selection of the soft threshold power (β) for WGCNA. A soft threshold power of β = 5 was chosen to ensure that the gene interaction network followed a scale-free topology (scale-free R² = 0.9). (b) Co-expression network analysis identified thirty-eight distinct modules through WGCNA, each representing clusters of co-expressed genes. (c) Correlation analysis between the identified modules and LUAD. The MEpink module exhibited the highest positive correlation with LUAD (correlation coefficient = 0.72, P = 3e-87), while the MEblue module displayed the highest negative correlation with LUAD (correlation coefficient = -0.83, P = 6e-141). (d) Scatterplots illustrating the relationship between gene significance (GS) and module membership (MM) within the MEblue and MEpink modules. Both the MEblue (correlation coefficient = 0.89, P = 1e-200) and MEpink (correlation coefficient = 0.91, P = 1e-200) modules showed strong correlations, highlighting their relevance to LUAD.

Biological functional analysis of MEblue DEGs

In this study, we identified two modules closely associated with LUAD. To further explore their relevance, we intersected these key modules with the RBM genes. The results revealed that both RBMS2 and RBMS3 were present in the MEblue module (Fig. 3a). Consequently, the MEblue module was selected for further analysis. We then overlapped the DEGs from the GSE116959 and TCGA datasets with the genes from the MEblue module. This analysis identified 651 co-differentially expressed genes, including RBMS3 (Fig. 3b).

Fig. 3

Identification of LUAD-associated modules and RBM gene intersections. (a) Both RBMS2 and RBMS3 were identified within the MEblue module, indicating their strong association with LUAD. (b) Overlap analysis of DEGs between the GSE116959 and TCGA datasets and the MEblue module genes revealed 651 co-DEGs, including RBMS3. (c) Functional enrichment analysis of MEblue DEGs using Gene Ontology (GO) revealed significant enrichment in the canonical Wnt signaling pathway and the regulation of the canonical Wnt signaling pathway, both of which play pivotal roles in cancer development. (d) Enrichment analysis of MEblue DEGs using the KEGG database indicated that these genes are significantly involved in several cancer-related signaling pathways, including Proteoglycans in cancer, MAPK signaling pathway, Renal cell carcinoma, and Transcriptional misregulation in cancer. (e) PPI network analysis illustrated the interactions and functional roles of DEGs within the MEblue module. RBMS3 was identified as a central gene within this network, underscoring its potential pivotal role in LUAD development.

Subsequently, we performed functional enrichment analysis on the MEblue DEGs. GO enrichment analysis showed significant enrichment in the canonical Wnt signaling pathway and its regulation, both of which are known to play critical roles in cancer development27 (Fig. 3c). KEGG pathway analysis revealed that the MEblue DEGs were prominently associated with cancer-related signaling pathways, including proteoglycans in cancer, the MAPK signaling pathway, renal cell carcinoma, and transcriptional misregulation in cancer (Fig. 3d). The PPI network further illustrated the interactions and functions of the DEGs in the MEblue module, with RBMS3 emerging as a key gene (Fig. 3e).

In summary, the DEGs in the MEblue module were directly or indirectly involved in cancer-related processes. Notably, RBMS3 was implicated in several cancer-related signaling pathways, while RBMS2 was absent from these enriched pathways. These findings suggest that RBMS3 plays a significant role in the initiation and progression of LUAD.

Prognostic value of RBMS3 and risk score

To further investigate the relationship between RBMS3 expression and clinical characteristics, we analyzed the expression levels of RBMS3 across different clinical categories (TNM stage, clinical stage) using data from the TCGA database. Our analysis revealed that RBMS3 expression was significantly higher in both TNM and clinical stages compared to normal tissues (Fig. 4a–d). This finding suggests that RBMS3 expression is associated with disease progression, and we hypothesize that it may serve as a prognostic biomarker for LUAD.

Fig. 4

Correlation analysis of RBMS3 expression with clinical parameters in cancer patients from the TCGA database. (a) Association between RBMS3 expression levels and tumor size (T stage). (b) Association between RBMS3 expression levels and lymph node metastasis (N stage). (c) Association between RBMS3 expression levels and distant metastasis (M stage). (d) Association between RBMS3 expression levels and overall clinical stage.

We next examined the prognostic value of RBMS3 in LUAD through univariate analysis using Cox proportional hazards models. The results indicated that RBMS3 expression was significantly correlated with LUAD prognosis (HR < 1, P < 0.05) (Fig. 5a). Subsequently, we performed a multivariate Cox analysis, incorporating both RBMS3 and clinical factors (Stage, pathologic-T, pathologic-N, pathologic-M, Sex). The analysis identified Stage, pathologic-N, and RBMS3 as key factors for constructing a risk score model (Fig. 5b). Using the median risk score, LUAD patients from the TCGA database were classified into low-risk and high-risk groups. Kaplan–Meier survival analysis demonstrated a significant difference in overall survival (OS) between the two groups, with the high-risk group exhibiting a notably shorter survival time than the low-risk group (P < 0.0001, Figs. 5c, d). Moreover, as shown in Fig. 5e, RBMS3 expression was significantly lower in the high-risk group compared to the low-risk group. The risk score model exhibited good predictive accuracy, as evidenced by the area under the ROC curve (AUC) values for 1-, 3-, and 5-year survival, which were 0.691, 0.678, and 0.703, respectively (Fig. 5f). The risk model was then tested in an independent validation set (GSE11969), where the AUC values for 1-, 3-, and 5-year survival were 0.708, 0.707, and 0.708, respectively (Supplementary Fig. 3). In addition, the calibration curve analysis indicated that the risk model was well calibrated and could predict the survival probability of patients relatively accurately, further supporting the reliability of the model (Supplementary Fig. 4).

Fig. 5

Prognostic analysis of RBMS3 in LUAD using Cox proportional hazards models. (a) Univariate Cox proportional hazards analysis demonstrated that elevated RBMS3 expression is significantly associated with improved overall survival in LUAD patients (Hazard Ratio [HR] = 0.766, 95% Confidence Interval [CI]: 0.602–0.973, P = 0.029). The log-rank test further validated the survival advantage in the high RBMS3 expression group (P = 0.027). (b) Multifactorial Cox regression analysis was performed, incorporating RBMS3 and other clinical indicators (tumor size T, lymph node metastasis N, gender, distant metastasis M, tumor stage). Insignificant factors were excluded, leaving tumor stage, lymph node metastasis N, and RBMS3 as the retained factors for the final model. (c) Kaplan–Meier survival curve illustrating a significant difference in overall survival (OS) between high-risk and low-risk groups, as determined by the risk score model (P < 0.0001). (d) Comparison of survival times between high-risk and low-risk groups, showing that patients in the high-risk group have significantly shorter survival times compared to those in the low-risk group. (e) Comparison of RBMS3 expression levels between high-risk and low-risk groups, indicating that RBMS3 expression is significantly lower in the high-risk group. (f) The areas under the receiver operating characteristic (ROC) curve (AUC) for 1-year, 3-year, and 5-year survival were 0.691, 0.678, and 0.703, respectively. These AUC values, all exceeding 0.65, demonstrate that the multifactorial Cox risk model incorporating RBMS3 has strong predictive performance.
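The hazard ratio, confidence interval, and P-value reported in panel (a) are linked by simple arithmetic, which the following sketch works through. It assumes the P-value comes from a Wald test and that the 95% interval was built as exp(β ± 1.96 × SE); neither assumption is stated in the text, so this serves only as a consistency check.

```python
import math

# Values reported for RBMS3 in Fig. 5a.
hr, ci_low, ci_high = 0.766, 0.602, 0.973

beta = math.log(hr)                                           # Cox coefficient (log hazard ratio)
se = (math.log(ci_high) - math.log(ci_low)) / (2 * 1.96)      # standard error recovered from the CI
z = beta / se                                                 # Wald statistic
p_value = math.erfc(abs(z) / math.sqrt(2))                    # two-sided normal P-value

print(round(beta, 3), round(se, 3), round(z, 2), round(p_value, 3))
```

Under these assumptions the recovered P-value is close to 0.03, in line with the reported P = 0.029, and the negative coefficient corresponds to a lower hazard with higher RBMS3 expression.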

Biological functional analysis of high- and low-risk groups DEGs

To explore the molecular differences between high- and low-risk groups, we utilized the “limma” package to identify DEGs, and the results are visualized in a volcano plot (Fig. 6a). The expression levels of these DEGs were presented in a heatmap (Fig. 6c). We then performed GO analysis to explore the biological processes (BPs) associated with the DEGs. The top 30 enriched terms revealed that DEGs were primarily associated with multicellular organismal homeostasis, humoral immune response, protein processing, and regulation of T cell proliferation, all of which are integral to maintaining homeostasis and cellular immunity (Fig. 6b). Additionally, KEGG pathway analysis highlighted several relevant cancer-related pathways, as shown in Fig. 6d.

Fig. 6

Differential gene expression and functional enrichment analysis between high- and low-risk groups. (a) Volcano plot displaying the results of DEGs analysis using the “limma” package, highlighting genes with significant upregulation and downregulation. (b) GO analysis of DEGs, highlighting the top 30 enrichment results. These results reveal associations with multicellular organismal homeostasis, humoral immune response, protein processing, and the regulation of T cell proliferation, processes crucial for maintaining homeostasis and cellular immunity. (c) Heatmap illustrating the expression patterns of DEGs across high- and low-risk groups, providing a visual representation of gene expression differences. (d) KEGG pathway analysis emphasizing pathways significantly associated with the DEGs, highlighting key biological processes involved in cancer progression.

Composition of immune cells in LUAD patients

The enrichment results indicated that the DEGs are involved in immune regulation processes. To further explore the immune landscape in LUAD, we analyzed the proportions of 22 immune cell types in high- and low-risk groups using the CIBERSORT algorithm. The TME cell composition results revealed that both the high- and low-risk groups had high proportions of plasma cells, CD8+ T cells, CD4+ memory resting T cells, and macrophages (M0, M1, and M2). In contrast, B cells (naive), follicular helper T cells, resting NK cells, and neutrophils were present at lower levels (Fig. 7a). The violin plot in Fig. 7b illustrates the differences in the proportions of these 22 immune cell types between the high- and low-risk groups. Notably, the high-risk group exhibited significantly higher levels of eosinophils and resting dendritic cells compared to the low-risk group (P < 0.05, P < 0.01). Next, we used the TIMER tool to assess the correlation between RBMS3 expression and immune cells. As shown in Fig. 7c, each dot represents a sample from the high- and low-risk groups. The blue line indicates a positive correlation between RBMS3 expression and the immune cell types. Specifically, RBMS3 expression was positively correlated with B cells (cor = 0.16, P = 4.25 × 10⁻⁴), CD8+ T cells (cor = 0.214, P = 1.86 × 10⁻⁶), CD4+ T cells (cor = 0.24, P = 8.99 × 10⁻⁸), macrophages (cor = 0.341, P = 1.07 × 10⁻¹⁴), neutrophils (cor = 0.277, P = 5.71 × 10⁻¹⁰), and dendritic cells (cor = 0.369, P = 3.70 × 10⁻¹⁷).

Fig. 7

Analysis of immune cell composition and correlation with RBMS3 expression in high- and low-risk groups. (a) Tumor microenvironment (TME) cell composition analysis using the CIBERSORT algorithm, showing the proportions of 22 immune cell types in high- and low-risk groups. Both groups showed higher proportions of plasma cells, CD8+ T cells, resting CD4+ memory T cells, and various macrophage subtypes (M0, M1, and M2), while the fractions of naive B cells, follicular helper T cells, resting NK cells, and neutrophils were comparatively lower. (b) Violin plot illustrating the differences in the proportions of the 22 immune cell types between high- and low-risk groups. Significantly increased levels of eosinophils and resting dendritic cells were observed in the high-risk group compared to the low-risk group (“ns” indicates non-significance, “*” indicates P < 0.05, and “**” indicates P < 0.01). (c) Correlation analysis between RBMS3 expression and immune cells using the TIMER tool. Each dot represents a sample from high- and low-risk groups. The blue line indicates a positive correlation between RBMS3 expression and several immune cell types, including B cells (cor = 0.16, P = 4.25 × 10⁻⁴), CD8+ T cells (cor = 0.214, P = 1.86 × 10⁻⁶), CD4+ T cells (cor = 0.24, P = 8.99 × 10⁻⁸), macrophages (cor = 0.341, P = 1.07 × 10⁻¹⁴), neutrophils (cor = 0.277, P = 5.71 × 10⁻¹⁰), and dendritic cells (cor = 0.369, P = 3.70 × 10⁻¹⁷).

Overexpression of RBMS3 regulated proliferation and invasion of LUAD cell lines

To further validate and complement the bioinformatics analyses, we assessed the expression of RBMS3 in LUAD cell lines (NCI-H1395, A549, PC-9, NCI-H2087) using qRT-PCR. As shown in Fig. 8a, RBMS3 expression was significantly lower in LUAD cell lines than in normal cells (HEB). Immunohistochemistry (IHC) analysis further confirmed the reduced expression of RBMS3 in LUAD tissues (Fig. 8b). Based on these findings, A549 and PC-9 cell lines were selected for subsequent experiments. We transfected A549 and PC-9 cells with an RBMS3 overexpression plasmid (Fig. 8c). A CCK-8 assay was used to quantify cell proliferation capacity. The results showed that the absorbance values of A549 and PC-9 cells overexpressing RBMS3 (OE-RBMS3) were significantly lower than those of the control group at 24, 48, and 72 h (P < 0.05) (Fig. 8d), indicating a significant reduction in proliferation rate. In the Transwell invasion assay, the number of A549 cells that passed through the chamber membrane was reduced by approximately 66.53%, and PC-9 cells by 52.46%, compared to the control group (P < 0.01) (Fig. 8e). The wound healing assay also demonstrated that the wound closure rate at 12 and 24 h was significantly lower in OE-RBMS3 cells than in the control group (P < 0.05) (Fig. 8f), confirming that OE-RBMS3 inhibits cell migration and invasion. For cell apoptosis, Annexin V-FITC/PI dual staining combined with flow cytometry analysis revealed that the total proportion of early and late apoptotic cells increased by 6.53% in A549 cells and 8.57% in PC-9 cells following OE-RBMS3 (P < 0.001) (Fig. 9a). In cell cycle analysis, OE-RBMS3 led to a decrease in the proportion of G1-phase cells in A549 cells from 44.6% to 36.87% (P < 0.01) and in PC-9 cells from 49.83% to 37.13% (P < 0.01), while the proportion of S-phase cells increased by 6.07%, and G2/M-phase cells increased by 1.7% (A549) and 6.67% (PC-9) (Fig. 9b). These results indicate that OE-RBMS3 promotes G1-to-S phase transition rather than inducing G1-phase arrest, thereby affecting cell cycle progression and inhibiting cell proliferation.
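The cell cycle figures reported above can be cross-checked with simple arithmetic. The following Python sketch is illustrative only and is not part of the original analysis: it assumes that the G1, S, and G2/M fractions sum to 100% in both the control and OE-RBMS3 conditions, and the helper names `point_change` and `implied_s_gain` are hypothetical.

```python
def point_change(before: float, after: float) -> float:
    """Change between two phase fractions, in percentage points."""
    return round(after - before, 2)


def implied_s_gain(g1_before: float, g1_after: float, g2m_gain: float) -> float:
    """S-phase gain implied by the G1 loss and the G2/M gain.

    Assumption (not stated in the text): G1 + S + G2/M = 100% in both
    conditions, so the G1 loss must equal the combined S and G2/M gain.
    """
    g1_loss = g1_before - g1_after
    return round(g1_loss - g2m_gain, 2)


# Values taken from the reported results (percent of cells).
a549_g1_change = point_change(44.6, 36.87)     # about -7.73 points
pc9_g1_change = point_change(49.83, 37.13)     # about -12.70 points

a549_s_gain = implied_s_gain(44.6, 36.87, 1.7)     # G2/M gain reported for A549
pc9_s_gain = implied_s_gain(49.83, 37.13, 6.67)    # G2/M gain reported for PC-9

print("A549: G1 change", a549_g1_change, "implied S gain", a549_s_gain)
print("PC-9: G1 change", pc9_g1_change, "implied S gain", pc9_s_gain)
```

Under this closure assumption, the implied S-phase gain is about 6.03 percentage points in both cell lines, close to the reported 6.07%. The small difference may reflect rounding in the reported values.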

Fig. 8

Experimental validation of RBMS3 expression and its role in LUAD progression. (a) qRT-PCR analysis of RBMS3 expression in LUAD cell lines (NCI-H1395, A549, PC-9, NCI-H2087) compared to normal cells (HEB), confirming significantly lower expression levels in LUAD cell lines. (b) Immunohistochemistry results demonstrating reduced RBMS3 expression in LUAD tissues compared to normal lung tissues. (c) Transfection of A549 and PC-9 cells with an RBMS3 overexpression plasmid, confirmed by Western blotting. (d) Proliferative ability of A549 and PC-9 cells transfected with OE-RBMS3 was significantly decreased compared to the control (NC). (e, f) Transwell invasion (A549: 66.53% reduction, P < 0.001; PC-9: 52.46% reduction, P < 0.01) and wound healing assays showing significantly reduced invasion and migration abilities in A549 and PC-9 cells with OE-RBMS3 compared to the NC.

Fig. 9

The impact of RBMS3 overexpression on A549 and PC-9 cell apoptosis and cell cycle progression. (a) Annexin V-FITC/PI dual staining combined with flow cytometry analysis revealed that the total proportion of early and late apoptotic cells increased by 6.53% in A549 cells and 8.57% in PC-9 cells following OE-RBMS3 (P < 0.001). (b) OE-RBMS3 led to a decrease in the proportion of G1-phase cells in A549 cells from 44.6% to 36.87% (P < 0.01) and in PC-9 cells from 49.83% to 37.13% (P < 0.01), while the proportion of S-phase cells increased by 6.07%, and G2/M-phase cells increased by 1.7% (A549) and 6.67% (PC-9). (*P < 0.05, **P < 0.01, ***P < 0.001, ****P < 0.0001).

Discussion

Lung cancer is a leading cause of cancer-related mortality worldwide, posing a major public health burden1. Despite advances in imaging and therapeutic approaches, including surgery, chemotherapy, and targeted therapies, the prognosis for advanced-stage lung cancer remains poor28. Early detection remains a challenge, as current diagnostic methods often fail to identify lung cancer at its initial stages. Additionally, drug resistance, limited efficacy, and severe side effects continue to hinder treatment success29. The molecular heterogeneity of lung cancer further complicates treatment, underscoring the need for in-depth studies to elucidate tumor progression mechanisms and develop novel targeted therapies to enhance patient outcomes30.

This study aimed to elucidate the role of RBM family genes in lung cancer development, progression, and prognosis. By examining how these genes regulate RNA splicing, stability, and translation, we sought to identify potential biomarkers and therapeutic targets. This research is significant because RBM genes, such as RBM10, function as tumor suppressors, and their mutations are associated with poor clinical outcomes in lung cancer31. Our findings are expected to elucidate RBM protein-mediated signaling pathways in lung carcinogenesis, offering new insights into their biological functions. These findings may facilitate the development of novel diagnostic tools and targeted therapies, ultimately improving clinical management and survival outcomes in lung cancer. This study advances our understanding of cancer biology and bridges molecular discoveries with clinical applications, potentially impacting oncology practice.

RNA sequencing analysis of TCGA data identified 4,456 DEGs, including 2,535 upregulated genes and 1,921 downregulated genes. Intersection with 11 RBM family genes highlighted RBMS2 and RBMS3 as key regulators (Fig. 1). RBMS2 and RBMS3 expression in LUAD tissues was consistent with TCGA-derived DEGs (Supplementary Table 3). Functional enrichment analysis of MEblue module DEGs identified key pathways, and protein–protein interaction (PPI) network analysis confirmed RBMS3 as a critical hub gene (Fig. 3e). RBMS3 is highly expressed in normal tissues but significantly downregulated in solid tumors. Its expression in tumors is negatively correlated with tumor size, lymph node metastasis, distant metastasis, and advanced clinical stage (Figs. 4, 5), suggesting a potential tumor-suppressive role. RBMS3 overexpression in lung cancer cells reduced proliferation, induced apoptosis, and promoted G1-to-S phase transition. Transwell and wound healing assays demonstrated that RBMS3 overexpression significantly inhibited the invasion and migration of A549 and PC-9 lung cancer cells (Fig. 8), supporting its potential role in suppressing cancer metastasis. RBMS3 functions as a key hub in the PPI network, interacting with multiple proteins to regulate tumorigenesis and progression. RBMS3 directly binds to the 3′ untranslated region (3′UTR) of Twist1 mRNA, downregulating its expression and consequently inhibiting matrix metalloproteinase-2 (MMP2), thereby suppressing tumor migration and invasion16. Additionally, RBMS3 downregulates β-catenin, cyclin D1, and c-Myc, leading to reduced tumor cell proliferation and migration32,33. In NSCLC, RBMS3 loss cooperates with BRAFV600E mutations to drive tumorigenesis. RBMS3 silencing upregulates Ctnnb1, Ccnd1, Axin2, Lgr5, and c-Myc mRNA, indicating that RBMS3 depletion enhances Wnt/β-catenin signaling, thereby promoting malignant progression22. RBMS3 is a downstream target of the LKB1/AMPK pathway and plays a critical role in suppressing epithelial-mesenchymal transition (EMT), thereby reducing lung cancer invasion and metastasis34. These findings underscore RBMS3 as a pivotal RNA-binding protein involved in cell cycle regulation, apoptosis, and migration, exerting significant tumor-suppressive effects.
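The intersection step summarized at the start of this paragraph amounts to a set operation. The Python sketch below is illustrative only: the gene lists are toy placeholders rather than the study's DEG list or its 11-gene RBM family list, and the function name `intersect_with_family` is hypothetical. It also confirms that the reported up- and downregulated counts sum to the reported DEG total.

```python
def intersect_with_family(degs, family):
    """Return the family genes that also appear among the DEGs, sorted by name."""
    return sorted(set(degs) & set(family))


# Counts reported in the text: upregulated + downregulated should equal the total.
n_up, n_down, n_total = 2535, 1921, 4456
counts_consistent = (n_up + n_down == n_total)

# Toy placeholder inputs (hypothetical; NOT the study's actual gene lists).
toy_degs = ["RBMS3", "GENE_A", "RBMS2", "GENE_B"]
toy_family = ["RBM5", "RBM10", "RBMS2", "RBMS3"]

hits = intersect_with_family(toy_degs, toy_family)
print(counts_consistent, hits)
```

With real data, `toy_degs` would be replaced by the gene symbols passing the study's differential expression thresholds, and `toy_family` by the RBM family list used by the authors.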

Our study revealed that RBMS3 expression was significantly downregulated in LUAD tissues relative to normal tissues. This finding aligns with previous studies demonstrating RBMS3 downregulation across multiple solid tumors and its inverse correlation with tumor size, lymph node metastasis, distant metastasis, and clinical stage. Liang et al. reported comparable findings in lung squamous cell carcinoma (LSCC), demonstrating that RBMS3 suppression contributed to tumor progression. They suggested RBMS3 as a potential diagnostic biomarker for LSCC21. However, RBMS3 function appears to be tumor type-specific. Lv et al. observed that RBMS3 downregulation in lung cancer correlated positively with histological grade, clinical stage, N stage, and poor overall survival34. Our findings support the role of RBMS3 as a tumor suppressor in LUAD, demonstrating its ability to inhibit proliferation and promote apoptosis, consistent with its function in other cancers.

While RBMS3 is associated with favorable prognosis in lung cancer, its role in regulating the tumor immune microenvironment and immunotherapy resistance remains unclear. Immune checkpoint inhibitors (ICIs) have improved 5-year survival rates by ~30% in advanced NSCLC35. However, primary and acquired resistance mechanisms limit their efficacy36. The tumor microenvironment (TME) critically influences immunotherapy response. ICI resistance is linked to an increased presence of M2-type tumor-associated macrophages (TAMs) and impaired dendritic cell (DC) function. Elevated resting DCs contribute to immune evasion, a key mechanism underlying PD-1/PD-L1 resistance37. Identifying novel therapeutic targets to enhance antitumor immunity in NSCLC is crucial for improving patient outcomes. Elevated eosinophils and resting DCs in high-risk LUAD patients may drive immune checkpoint resistance. Eosinophils, despite their complex role in cancer immunology, have been implicated in modulating the TME38. They secrete cytokines and chemokines39, which can either promote or inhibit tumor growth. In immune checkpoint resistance, eosinophils interact with T cells and macrophages, modulating the immune microenvironment and promoting resistance40. Resting DCs, essential for antigen presentation and T-cell activation41, are elevated in high-risk LUAD patients. However, an increased number of resting DCs correlates with reduced T-cell activation42, weakening antitumor immunity and promoting immune evasion43. Impaired antigen presentation by DCs enables cancer cells to evade immune surveillance, a central mechanism of checkpoint inhibitor resistance43. In summary, increased eosinophils and resting DCs in high-risk LUAD may drive immune checkpoint resistance. Further investigations are required to elucidate the underlying mechanisms and develop targeted approaches to enhance ICI efficacy.

Emerging evidence highlights the immunomodulatory role of RBMS3. Zhou et al. reported that RBMS3 knockout enhances T cell-mediated tumor cell cytotoxicity, underscoring its essential role in immune surveillance13. TME profiling in NSCLC showed enrichment of plasma cells, CD8+ T cells, resting memory CD4+ T cells, and macrophages (M0, M1, and M2), whereas immature B cells, follicular helper T cells, resting natural killer cells, and neutrophils were less prevalent. High-risk patients exhibited increased eosinophils and resting DCs relative to low-risk patients. RBMS3 expression correlated positively with multiple immune cell populations, including B cells, CD8+ T cells, CD4+ T cells, macrophages, neutrophils, and DCs (Fig. 7). In addition to its association with immune infiltration, RBMS3 actively regulates immune responses in multiple cancers. In ovarian cancer, RBMS3 overexpression enhances immune cell infiltration by upregulating M1 macrophage markers and downregulating M2 macrophage markers, regulatory T cells (Tregs), and myeloid-derived suppressor cells (MDSCs), thereby promoting antitumor immunity44. In triple-negative breast cancer, RBMS3 directly binds to the 3′ untranslated region (3′UTR) of PD-L1 mRNA, stabilizing it and inducing PD-L1 overexpression13. This effect is amplified by CD39 and CD73 upregulation, establishing an immunosuppressive network that weakens antitumor immunity13. In NSCLC, our findings indicate that reduced RBMS3 expression is associated with diminished DC activation, which could contribute to impaired antigen presentation and immune evasion. Considering the dynamic nature of the TME and the role of immune infiltrates in immunotherapy response45, RBMS3 acts as a context-dependent regulator of the tumor immune landscape. These findings support RBMS3 as a prognostic biomarker and therapeutic target, although further studies are needed to elucidate its immunoregulatory role and therapeutic relevance.

RBMS3 has gained significant attention for its role in oncogenic pathways and potential as a prognostic biomarker in lung cancer. Our study identified RBMS3 as a key regulator of cancer-related pathways, particularly the Wnt/β-catenin signaling pathway (Fig. 3). WGCNA identified RBMS3 within the MEblue module, which is highly associated with LUAD and multiple cancer-related GO terms and KEGG pathways. Prior studies have shown that RBMS3 interacts with key components of the Wnt signaling cascade, corroborating our findings. RBMS3 influences β-catenin stability in the Wnt pathway, and its dysregulation in cancer cells is linked to aberrant Wnt signaling activation46. These findings indicate that RBMS3 functions as an upstream regulator of Wnt-driven oncogenesis, reinforcing its tumor-suppressive role. In the MAPK signaling pathway, RBMS3 may regulate phosphorylation events, affecting key downstream signaling processes47; however, its precise molecular mechanisms remain to be elucidated. RBMS3 likely modulates upstream regulators and downstream effectors of MAPK signaling, thereby influencing cell proliferation, differentiation, and survival48. Our study identified a co-expression relationship between RBMS3 and genes in the Wnt/MAPK pathways, suggesting its integration into these oncogenic networks. These findings provide mechanistic insights into RBMS3 regulation in LUAD and highlight the need for further experimental validation to clarify its role in these pathways. Transposon mutagenesis studies indicate that RBMS3 silencing accelerates BRAFV600E-driven lung cancer progression, suggesting that RBMS3 loss may enhance tumor growth and metastasis22. This is consistent with bioinformatics findings by Zhang et al., showing significantly reduced RBMS3 expression in small cell lung cancer compared to normal tissues49. These findings suggest that RBMS3 expression is regulated by upstream miRNAs, further supporting its tumor-suppressive role in lung cancer49. Despite its therapeutic potential in NSCLC, further research is needed to elucidate RBMS3’s molecular mechanisms and develop targeted therapies leveraging its tumor-suppressive function.

Beyond lung cancer, recent studies identify RBMS3 as a key post-transcriptional regulator of PD-L1, contributing to immune evasion in breast cancer13. RBMS3 modulates tumor immune evasion by regulating PD-L1 expression and interacting with immune cells, thereby shaping the TME. RBMS3 also plays a critical role in EMT, a key driver of lung cancer progression34. AMP-activated protein kinase (AMPK) has been reported to inhibit lung cancer invasion and metastasis by upregulating RBMS3 expression34. AMPK, a key regulator of cellular energy metabolism and tumor suppression, exhibits dysregulated expression and altered functionality in various cancers50. Dysregulated AMPK signaling is implicated in gastric cancer and hepatocellular carcinoma progression51,52. These findings reinforce RBMS3 as a promising therapeutic target, providing opportunities for advancing cancer immunotherapy and targeted therapies.

CRISPR/Cas9-mediated RBMS3 silencing has been shown to promote BRAFV600E-driven lung organoid expansion and accelerate malignant progression in genetically engineered mouse models (GEMs)22. These findings underscore RBMS3’s role in tumorigenesis and support the application of gene-editing strategies in functional cancer genomics. However, challenges remain in CRISPR/Cas9 applications, particularly off-target effects, which necessitate further refinement for clinical translation53. Furthermore, small-molecule modulators of RBMS3 are largely unexplored. Future research should prioritize high-throughput screening for RBMS3-targeting small molecules and explore nucleic acid-based approaches, including siRNA and miRNA. Expanding these approaches will offer novel insights into RBMS3 regulation, including potential epigenetic modifications and protein–protein interactions, thereby broadening its therapeutic applicability.

Advances in spatial omics, single-cell resolution techniques, and multimodal data integration have deepened our understanding of complex biological processes. Among these, spatial CITE-seq enables simultaneous profiling of protein expression and full-transcriptome data at single-cell resolution54. This approach enhances spatial resolution in gene expression and cellular function analysis by integrating protein and transcriptomic profiling55. In lung cancer research, spatial CITE-seq can delineate the spatial organization of TME components and their molecular profiles, providing insights into RBMS3’s regulatory role. This method allows for the characterization of the molecular landscape of RBMS3-expressing cells, identifying key molecular markers and pathways involved in its regulatory network. Additionally, multimodal tri-omics mapping has proven valuable in analyzing spatial dynamics of complex biological systems, including mammalian brain development and immune regulation56. The integration of genomics, transcriptomics, and proteomics facilitates a holistic understanding of biological systems57. A multimodal approach to RBMS3 research can elucidate its molecular regulatory network in lung cancer, clarifying its spatial and temporal interactions within the tumor microenvironment. Furthermore, Perturb-DBiT, a spatially resolved CRISPR-based screening technique, allows for in vivo functional genomic analysis58. This technology enables high-resolution mapping of gene function and cellular phenotypes within tissues, refining our understanding of RBMS3’s role in lung cancer progression. Future studies should leverage spatially resolved CRISPR screening in in vivo models to dissect RBMS3’s role in tumor progression, invasion, and metastatic potential across distinct microenvironments, further validating its mechanistic contributions58.

While this study offers new insights into RBMS3’s tumor-suppressive role in LUAD, several limitations should be considered. The reliance on TCGA and GEO datasets may introduce population-specific biases, potentially limiting generalizability. However, integrating multiple independent datasets strengthens the robustness of our conclusions. Future studies should incorporate larger, more diverse patient cohorts to improve validation and generalizability. Although in vitro studies provide strong evidence for RBMS3’s role in LUAD, further in vivo validation, rescue experiments, and clinical investigations are essential to confirm its biological significance. The lack of immunohistochemistry (IHC) and immunofluorescence (IF) analyses restricts our ability to comprehensively characterize RBMS3 expression and function in tumor tissues. Future research should incorporate IHC, IF, and in vivo assays to refine the characterization of RBMS3 in LUAD progression. This study primarily focused on the MEblue module, which is enriched for RBM family genes (RBMS2 and RBMS3), aligning with our research objectives. Although the MEpink module is also associated with LUAD, its lack of RBM family genes limited its relevance to this study. Future investigations should explore MEpink-associated pathways to gain a broader understanding of LUAD pathogenesis. Despite these limitations, this study identifies RBMS3 as a key regulator in LUAD and provides a foundation for its clinical translation. Future research should integrate multi-omics approaches, expand experimental validation, and refine predictive models to advance precision medicine in LUAD. Ultimately, these findings aim to inform personalized therapeutic strategies, enhancing patient survival and quality of life.

In conclusion, this study identified RBMS2 and RBMS3 as key differentially expressed genes in LUAD, with RBMS3 serving as a strong prognostic marker. Functional analyses demonstrated its tumor-suppressive role in inhibiting LUAD cell proliferation, invasion, and migration. Moreover, its involvement in oncogenic signaling and immune regulation underscores its broader biological significance. While these findings offer valuable insights, further in vivo and clinical validation is necessary. Future studies should integrate multi-omics approaches and computational modeling to refine RBMS3’s prognostic accuracy and facilitate clinical translation.