Unveiling diagnostic biomarkers and therapeutic targets in lung adenocarcinoma using bioinformatics and experimental validation

Wu, Sixuan; Tang, Yuanbin; Pan, Qihong; Zheng, Yaqin; Tan, Yeru; Pan, Junfan; Li, Yuehua

doi:10.1038/s41598-025-05227-2

Article
Open access
Published: 02 July 2025

Scientific Reports volume 15, Article number: 22893 (2025)

Subjects

Non-small-cell lung cancer

Abstract

Lung adenocarcinoma (LUAD) is a major challenge in oncology due to its molecular complexity and generally poor prognosis. The aim of this study was to find diagnostic markers and therapeutic targets for LUAD by integrating differential gene expression analysis, weighted gene co-expression network analysis (WGCNA), and machine learning methods. Differentially expressed genes (DEGs) were identified through the analysis of gene expression data from the Gene Expression Omnibus (GEO) database. WGCNA identified ten gene co-expression modules, of which the red module showed the most significant correlation with clinical features. In addition, a machine learning model combining Stepglm[backward] with the random forest algorithm achieved the highest C-index (0.999) and screened eight core genes, among which ST14 was noted for its excellent predictive ability. Single-cell RNA sequencing data from LUAD samples were then used to analyze the expression patterns of these genes across 29 cellular subpopulations, revealing their significant association with immune cell infiltration. Of particular note, the association of ST14 with clinical prognosis, drug responsiveness, and immune infiltration was validated, while enrichment analysis further clarified its role in key biological pathways. Ultimately, the expression of the core genes was validated experimentally. This study provides new insights into the pathogenesis of LUAD, clarifies potential diagnostic markers and therapeutic targets, and provides an important basis for future clinical interventions.

Introduction

Lung cancer represents a major health issue and is one of the leading causes of cancer-related mortality worldwide^1,2. Non-small cell lung cancer (NSCLC) constitutes the predominant subtype of lung cancer, representing approximately 85% of all cases^3,4,5. Among NSCLC cases, lung adenocarcinoma (LUAD) is the most prevalent type^6,7,8. Recent global data show a steady increase in the incidence of LUAD, particularly among women, with LUAD gradually surpassing squamous cell carcinoma as the most common type of lung cancer⁹. In 2020, lung cancer cases worldwide reached 2.2 million, with LUAD making up a significant proportion, especially among women, where its incidence reached 57%⁹. The high morbidity and mortality associated with LUAD not only severely impact patients’ quality of life but also place considerable strain on public health systems.

As early symptoms of lung cancer are often subtle, many patients are diagnosed at advanced stages, complicating both treatment and prognosis^10,11,12. Current treatment options, including surgery, chemotherapy, and targeted therapies, show limited efficacy, particularly in advanced stages of the disease^13,14. Additionally, the heterogeneity of LUAD further complicates treatment outcomes^15,16. Despite progress in genomic research, targeted therapies, and immunotherapy, the mechanisms underlying LUAD progression and drug resistance remain insufficiently understood. With the rapid advancement of precision medicine, statistical and computational algorithms have been increasingly employed to identify novel diagnostic and therapeutic targets^17,18. This study addresses this gap by employing bioinformatics analysis to identify potential biomarkers and diagnostic and therapeutic targets. By integrating machine learning techniques, we aim to enhance the diagnostic accuracy of LUAD and uncover key genes associated with prognosis and therapeutic response, ultimately improving patient outcomes.

The main goal of this study was to identify and analyze differentially expressed genes (DEGs) in LUAD using an integrated bioinformatics approach, since LUAD requires new biomarkers and therapeutic targets to address its clinical challenges. Multiple datasets from the Gene Expression Omnibus (GEO) database were merged, and batch effect correction techniques were applied to enhance the consistency of gene expression across samples. In this study, 645 DEGs were identified, and the association between gene modules and clinical features was explored through weighted gene co-expression network analysis (WGCNA), revealing underlying biological processes. Additionally, a robust machine learning diagnostic model was constructed by integrating these DEGs, identifying eight core genes with significant predictive power. The study further investigated the expression, prognosis, immune infiltration, and drug sensitivity of the ST14 gene among the core genes in LUAD. Quantitative Reverse Transcription Polymerase Chain Reaction (qRT-PCR) experiments were conducted to validate the expression of the eight core genes in LUAD cell lines, Beas-2B control cells, and 30 pairs of LUAD tumors with adjacent normal tissues. The novelty of this study lies in the integration of DEG identification, machine learning, and single-cell analysis to construct a powerful diagnostic model, exploring the immune landscape and drug sensitivity of LUAD, and offering new insights for future diagnostic and therapeutic strategies, as well as personalized medicine.

Materials and methods

Data acquisition

Clinicopathologic information and genome-wide expression data were obtained from five LUAD cohorts (GSE32665, GSE68465, GSE115002, GSE32863, and GSE75037), all through the GEO database¹⁹ (https://www.ncbi.nlm.nih.gov/geo/). We used the “sva” package to correct for batch effects for both the GSE32665 and GSE68465 data sets²⁰. First, multiple expression data files were read and merged to extract the intersecting genes, then the ComBat method was applied for batch correction, and finally the batch-corrected expression data were output. In addition, the GEO database provided single-cell RNA sequence (scRNA-seq) data of 9 LUAD patients in GSE189357. mRNA expression data and corresponding clinical information for LUAD patients were obtained from The Cancer Genome Atlas (TCGA) database (https://portal.gdc.cancer.gov/)²¹. The workflow of this study is illustrated in Fig. S1.
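The merge-then-correct sequence described above can be illustrated with a small sketch. This is not the ComBat method from the “sva” package: ComBat estimates batch parameters with an empirical Bayes model, whereas the hypothetical function `merge_and_center` below only keeps the intersecting genes and rescales each batch to the pooled mean and standard deviation per gene. The function name and the input layout are assumptions made for illustration.

```python
from statistics import mean, pstdev

def merge_and_center(batches):
    """Merge batches on shared genes and apply a per-gene location/scale adjustment.

    batches: dict mapping batch name -> dict mapping gene -> list of expression
    values (one per sample). Simplified stand-in for ComBat, not ComBat itself.
    """
    # Keep only genes measured in every batch (the intersecting genes).
    shared = set.intersection(*(set(b) for b in batches.values()))
    merged = {}
    for gene in sorted(shared):
        # Pooled mean and SD across all batches define the common target scale.
        pooled = [v for b in batches.values() for v in b[gene]]
        g_mean, g_sd = mean(pooled), pstdev(pooled)
        row = []
        for name in sorted(batches):
            vals = batches[name][gene]
            b_mean, b_sd = mean(vals), pstdev(vals)
            # Guard against zero variance within a batch.
            scale = g_sd / b_sd if b_sd > 0 else 1.0
            row.extend(g_mean + (v - b_mean) * scale for v in vals)
        merged[gene] = row
    return merged

example = {
    "batch_A": {"g1": [1.0, 2.0, 3.0], "g2": [5.0, 5.0, 6.0]},
    "batch_B": {"g1": [11.0, 12.0, 13.0], "g3": [0.0, 1.0, 2.0]},
}
corrected = merge_and_center(example)
```

Note that a plain rescaling like this also removes genuine biological differences when batches differ in their tumor/normal composition, which is one reason model-based methods such as ComBat are preferred in practice.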

In this study, samples of LUAD tissues and corresponding adjacent non-cancerous tissues were obtained from 30 lung cancer patients at Fujian Provincial Cancer Hospital. The study received approval from the hospital’s Ethics Committee (Approval No. K2023-417-01), and informed consent was obtained from all participants before their inclusion in the study.

WGCNA analysis

WGCNA (version 1.73)²² was conducted to construct a co-expression network for LUAD samples. Normalized expression data were imported, low-variance genes were filtered, and sample group information was extracted. Genes and samples that did not meet quality control standards were excluded based on missing value assessment. A sample clustering tree was generated using hierarchical clustering, and outlier samples were removed by applying a predefined cut height. Clinical trait data were aligned with the expression matrix, followed by re-clustering and heatmap visualization to evaluate sample–trait relationships. To identify an appropriate soft-thresholding power, a range of candidate values was evaluated according to scale-free topology criteria. A weighted adjacency matrix was then constructed and converted into a topological overlap matrix to measure gene co-expression similarity. Genes were clustered into modules using hierarchical clustering with a minimum module size of 60. Modules with similar expression patterns were merged based on eigengene correlation using a threshold of 0.25, resulting in the final module structure. Pearson correlation coefficients were calculated to assess associations between module eigengenes and clinical traits, and the results were visualized in a module–trait correlation heatmap. Module membership and gene significance (GS) were further analyzed to identify key modules and hub genes associated with clinical features.
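The adjacency and topological overlap steps described above can be sketched in a few lines. The example assumes an unsigned network, in which the adjacency is the absolute Pearson correlation raised to the soft-thresholding power β (the Results report β = 4), and uses the standard unsigned topological overlap formula. The function names are hypothetical, and the WGCNA package itself offers further options (signed networks, other correlation types) that are not shown.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def adjacency(expr, beta):
    """Unsigned weighted adjacency: a_ij = |cor(gene_i, gene_j)| ** beta."""
    n = len(expr)
    return [[1.0 if i == j else abs(pearson(expr[i], expr[j])) ** beta
             for j in range(n)] for i in range(n)]

def topological_overlap(adj):
    """TOM_ij = (sum_u a_iu * a_uj + a_ij) / (min(k_i, k_j) + 1 - a_ij)."""
    n = len(adj)
    # Connectivity of each gene, excluding the self-connection.
    k = [sum(adj[i][u] for u in range(n) if u != i) for i in range(n)]
    tom = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                shared = sum(adj[i][u] * adj[u][j]
                             for u in range(n) if u not in (i, j))
                tom[i][j] = (shared + adj[i][j]) / (min(k[i], k[j]) + 1 - adj[i][j])
    return tom

# Rows are genes, columns are samples (toy values).
expr = [[1, 2, 3, 4], [1, 3, 2, 4], [4, 1, 3, 2]]
adj = adjacency(expr, beta=4)
tom = topological_overlap(adj)
```

Modules are then obtained by hierarchical clustering on the dissimilarity 1 − TOM, which is the step the minimum module size of 60 applies to.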

Diagnostic models of intersecting genes generated by an integrated machine learning-based approach and analysis of model genes

The LUAD data were analyzed using 113 machine learning methods. First, the gene expression data for both the training and validation groups were processed to ensure that genes common to both groups were retained, followed by centering and normalization of the data. Feature selection was then performed using machine learning algorithms such as random forest, glmnet, and xgboost to identify important genes, which were subsequently used to construct predictive models. The performance of each model was measured primarily by calculating the Area Under the Curve (AUC) value and comparing results between the training and validation sets. Risk scores for each sample were calculated, and classification predictions were made. Heatmaps were generated to visualize the AUC values of different models, with the best-performing model selected. The selected features were combined using multivariate logistic regression models to further enhance prediction accuracy. Additionally, the Receiver Operating Characteristic (ROC) curve analysis of the GEO dataset was conducted using the “pROC” package (version 1.18.5). The risk matrix file was read, and dataset ID information was extracted. Sample grouping information was then extracted and labeled as 1 and 0. For each model, ROC curves were generated, and the AUC values were determined, with confidence intervals obtained using the bootstrap method. The ROC curves for each dataset were plotted, with AUC values and 95% confidence intervals labeled.
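As an illustration of the ROC step, the sketch below computes an AUC from 0/1 labels and risk scores and derives a percentile bootstrap confidence interval. It is a minimal stand-in, not the “pROC” implementation: the function names, the number of resamples, and the simple non-stratified resampling scheme are all assumptions.

```python
import random

def auc(labels, scores):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_ci(labels, scores, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the AUC."""
    rng = random.Random(seed)
    idx = range(len(labels))
    stats = []
    while len(stats) < n_boot:
        sample = [rng.choice(idx) for _ in idx]
        l = [labels[i] for i in sample]
        # Skip resamples that contain only one class; AUC is undefined there.
        if 0 < sum(l) < len(l):
            stats.append(auc(l, [scores[i] for i in sample]))
    stats.sort()
    lower = stats[int(n_boot * alpha / 2)]
    upper = stats[int(n_boot * (1 - alpha / 2)) - 1]
    return lower, upper

labels = [1, 1, 1, 0, 0, 0]              # 1 = tumor, 0 = normal (toy data)
scores = [0.9, 0.8, 0.4, 0.5, 0.2, 0.1]  # hypothetical risk scores
point = auc(labels, scores)
ci = bootstrap_ci(labels, scores, n_boot=200)
```

With very few samples, as in this toy input, the interval is wide and often touches 1.0, which is worth keeping in mind when reading AUC values close to 1.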

Simultaneously, volcano plots of the model genes were plotted using the “ggplot2” (version 3.5.1) and “ggrepel” packages, while the gene expression data were analyzed using the “limma” (version 3.62.1), “reshape2” (version 1.4.4), “ggpubr”, and “PerformanceAnalytics” packages. The gene expression data and associated gene lists were read to extract the relevant model gene expressions. The sample grouping information was extracted, and the data were formatted for boxplot generation. Boxplots were plotted using ggboxplot to compare gene expression differences between groups, with significance levels labeled by the Wilcoxon test. Additionally, correlation plots were generated using the “PerformanceAnalytics” package to display the relationships between model gene expressions. Boxplots and correlation plots were then generated to visualize gene expression differences and correlations. Finally, the protein interaction networks of the model genes were explored using the GeneMANIA website²³ (https://genemania.org/) to gain deeper insights into gene interrelationships and biological functions.

Single-cell sequencing analysis

An in-depth analysis of model genes was conducted using single-cell RNA sequencing. The “Seurat” package (version 5.1.0) was applied to preprocess the data, removing low-quality cells, normalizing the data, and selecting highly variable genes. Principal component analysis (PCA) was then performed to reduce the data’s dimensionality, followed by the application of the Harmony algorithm to correct for batch effects. Cells were subsequently clustered using the t-SNE method to identify distinct cell populations. For cell type annotation, the “SingleR” package (version 2.8.0) was employed, integrating multiple reference datasets. The LUAD single-cell data were analyzed using Seurat, and the expression of model genes across different cell clusters was visualized. The expression patterns of these signature genes were presented using scatter plots (FeaturePlot) and bubble plots (DotPlot). In the scatter plot, gene expression levels were represented by a green-to-red gradient, while the bubble plot displayed gene expression intensity across clusters. Based on these visualizations, the distribution and expression differences of the target genes in distinct cell populations were clearly observed. For the single-cell analysis of the ST14 gene, eight distinct single-cell datasets were analyzed using the TISCH database²⁴ (http://tisch.comp-genomics.org/search-gene/) (GSE117570, GSE127465, GSE146100, GSE143423, GSE150660, GSE131907, GSE162498, GSE139955).

Expression, prognosis, and enrichment analysis of ST14

The analysis of ST14 expression and prognosis was conducted using data from the TCGA database. The “limma” and “ggplot2” packages were employed to analyze expression differences of the target gene ST14. First, gene expression data were processed to extract the ST14 gene, determine the sample sizes in the normal and tumor groups, and apply a log2 transformation. Box plots were used to visualize the expression differences of ST14 between the two groups, with statistical significance determined by the Wilcoxon test. Differential analysis was also conducted on paired samples using the “limma” and “ggpubr” packages, and paired difference analysis plots were generated to illustrate the expression variations of the target gene between the groups. Survival analysis of LUAD data was performed using the “survival” and “survminer” packages. Additionally, Gene Set Enrichment Analysis (GSEA) was applied to LUAD data, with samples grouped based on target gene expression. LogFC values were calculated, significantly enriched pathways were identified, and enrichment maps were generated. Gene Set Variation Analysis (GSVA) was also performed on LUAD data²⁵, with gene set scores calculated for each group. Differential analysis was conducted, and significant pathways were displayed in bar charts.

Immune infiltration analysis and drug sensitivity analysis

The relationship between the ST14 gene and tumor microenvironment (TME) score was analyzed using the “ggplot2” and “limma” packages in R. Wilcoxon tests were performed to assess the statistical significance. Next, the relationship between ST14 and immune cell infiltration was examined. Box-and-whisker plots were created to show gene expression differences under varying immune cell infiltration types. The Spearman correlation between ST14 expression and immune cell infiltration types was calculated, and the results were reported. The association between ST14 and immune checkpoint (ICP) genes was additionally assessed, and a correlation matrix was plotted to highlight potential associations. Furthermore, the association between ST14 and tumor mutation burden (TMB) was analyzed, with scatter plots illustrating the correlation using Spearman correlation analysis. Lastly, the relationship between ST14 and immune scoring was examined using the “limma” and “ggpubr” packages, with violin plots generated and significance indicated. The “oncoPredict” package^26,27 was applied to examine the relationship between ST14 and drug sensitivity, and the final drug sensitivity analysis results were output.
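The Spearman correlations used above for immune cell infiltration and TMB amount to a Pearson correlation computed on ranks, with tied values sharing their average rank. The sketch below shows that calculation; the function names and the toy vectors are illustrative assumptions, and no p-value is computed.

```python
from math import sqrt

def ranks(values):
    """1-based ranks; tied values receive the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the two rank vectors."""
    rx, ry = ranks(x), ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / sqrt(sxx * syy)  # undefined if either input is constant

st14_expression = [2.1, 3.4, 1.2, 5.6, 4.0]       # toy values
macrophage_m0 = [0.10, 0.12, 0.05, 0.30, 0.11]    # toy infiltration fractions
rho = spearman(st14_expression, macrophage_m0)
```

Because only ranks enter the calculation, the coefficient is insensitive to monotone transformations such as the log2 transform applied to expression values.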

Cell culture and qRT-PCR analysis

In this study, the Beas-2B, H1299, A549, PC9, and HCC827 cell lines were obtained from Procell (Wuhan, China). For cell resuscitation, the frozen storage solution was removed from a −80 °C freezer and rapidly thawed in a 37 °C water bath. After thawing, the cells were immediately transferred to RPMI-1640 medium supplemented with 10% fetal bovine serum and 1% penicillin/streptomycin. To promote optimal cell growth, the cells were cultured in 10 cm dishes at 37 °C in a humidified incubator with 5% CO₂. The medium was regularly replaced, and the cells’ growth was closely monitored to ensure they remained in the logarithmic growth phase. For total RNA extraction, TRIzol reagent (Invitrogen, Carlsbad, CA, USA) was used according to the manufacturer’s instructions. The extracted RNA was then reverse transcribed into complementary DNA (cDNA) using the PrimeScript RT kit (Takara). Quantitative reverse transcription polymerase chain reaction (qRT-PCR) was subsequently performed using the Takara SYBR Green method. The results were analyzed using the 2^−ΔΔCt method, and all data were normalized to GAPDH as an internal control. The resulting relative gene expression values were used for subsequent analysis to evaluate the differential expression of genes across different cell lines. Primer sequences used in the qRT-PCR experiments are provided in Table S4.
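The relative quantification described above can be written out as a short worked example. ΔCt is the target Ct minus the reference (GAPDH) Ct within a sample, ΔΔCt is the sample ΔCt minus the control ΔCt, and the fold change is 2 raised to −ΔΔCt. The function name and the Ct values are hypothetical, and the calculation assumes roughly 100% amplification efficiency for both primer pairs.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene versus control by the 2^-ddCt method."""
    # Normalize each target Ct to the reference gene (e.g. GAPDH).
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    # Compare the normalized sample to the normalized control.
    delta_delta = delta_sample - delta_control
    return 2 ** (-delta_delta)

# Hypothetical Ct values: a tumor cell line versus Beas-2B control cells.
fold = relative_expression(ct_target_sample=22.0, ct_ref_sample=18.0,
                           ct_target_control=24.0, ct_ref_control=18.0)
```

Here the target crosses the threshold two cycles earlier in the sample than in the control after normalization, which corresponds to a four-fold higher relative expression.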

Statistical analysis

Data processing, analysis, and visualization were performed using R software (version 4.4.2) and GraphPad Prism 9. Quantitative variables were analyzed by independent samples t-tests, and statistical significance was set at P < 0.05 for comparisons between groups.

Results

GEO-LUAD database DEGs identification and WGCNA analysis

Initially, the GSE32665 and GSE68465 datasets were integrated, and batch correction was applied to eliminate differences between the datasets. This correction significantly improved the consistency of gene expression distributions and aligned the datasets in principal component space (Fig. S2). A total of 645 DEGs were detected using the criteria of |logFC| > 1 and adjusted p < 0.05 (Table S1), comprising 358 upregulated genes and 287 downregulated genes. The distribution of significantly up- and downregulated genes was visualized in a volcano plot (Fig. 1A), while the expression patterns of DEGs across different samples were illustrated in a heatmap (Fig. 1B). Subsequently, WGCNA was performed. A sample clustering tree, based on gene expression data, was constructed to detect potentially anomalous samples, providing a framework for data quality control in subsequent analyses (Fig. S3A). Additionally, the clustering tree, combined with a trait heatmap, revealed expression similarities between samples and their alignment with clinical features, establishing a foundation for module construction and trait correlation analysis (Fig. S3B). When the soft threshold β was set to 4, the network exhibited scale-free distribution characteristics (Fig. 1C). The clustering tree of module eigengenes further elucidated the relationships and similarities among different modules (Fig. S3C). Through the construction of co-expression modules and the merging of similar modules, a total of 10 gene modules were obtained (Fig. 1D). Correlation analysis between clinical traits and gene modules revealed that the red module demonstrated the strongest association with clinical traits (Cor = 0.28, p = 4e−13) (Fig. 1E). Further analysis of module gene significance confirmed that the red module exhibited the highest gene significance, indicating its strong correlation with the target trait (Fig. 1F).
Additionally, a scatterplot illustrated the correlation between the module membership values of genes within the module and their gene significance in the treatment group (Fig. S3D). The red module comprised 493 genes in total (Table S2). Venn diagram analysis of the intersection between the red module genes and the DEGs identified 66 shared genes (Fig. 1G) (Table S3). The Gene Ontology (GO) analysis of intersecting genes identified vesicle-mediated transport in synapse and vesicle organization as the most enriched biological processes (BP). For cellular components (CC), the most prominent terms were early endosome, postsynaptic specialization, and neuron-to-neuron synapse. Regarding molecular functions (MF), the analysis highlighted peptide binding and amide binding as the most significant terms (Fig. S4A). The Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis of intersecting genes identified the most significantly enriched pathways as the renin-angiotensin system, TGF-beta signaling pathway, and starch and sucrose metabolism (Fig. S4B)^28,29,30.
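The DEG criteria reported above (|logFC| > 1 and adjusted p < 0.05) can be sketched as a filter over per-gene statistics. The text does not state which multiple-testing adjustment was used, so the Benjamini-Hochberg procedure below is an assumption, as are the function names and the toy table; the moderated statistics that “limma” computes upstream are not reproduced here.

```python
def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def select_degs(table, lfc_cut=1.0, p_cut=0.05):
    """table: dict gene -> (logFC, raw p-value). Returns (up, down) gene lists."""
    genes = list(table)
    adjusted = bh_adjust([table[g][1] for g in genes])
    up = [g for g, a in zip(genes, adjusted) if a < p_cut and table[g][0] > lfc_cut]
    down = [g for g, a in zip(genes, adjusted) if a < p_cut and table[g][0] < -lfc_cut]
    return up, down

# Toy statistics: (logFC, raw p-value) per hypothetical gene.
toy = {
    "geneA": (2.0, 0.001),
    "geneB": (-1.5, 0.002),
    "geneC": (0.3, 0.0005),   # significant but below the fold-change cutoff
    "geneD": (3.0, 0.9),      # large fold change but not significant
}
up, down = select_degs(toy)

# Intersecting the DEGs with a module's genes is then a plain set operation.
shared = set(up + down) & {"geneA", "geneC", "geneX"}
```

The last line mirrors the Venn step, in which the DEGs were intersected with the red module genes.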

Machine learning construction of diagnostic models and analysis of model genes

For the GEO datasets, training and test sets were constructed, and 113 predictive models were fitted using machine learning algorithms. The C-index of each model was subsequently calculated for both the training and validation sets. The results indicated that the optimal model was the combination of Stepglm[backward] and RF, achieving the highest average C-index (0.999) (Fig. 2A). Based on this model, eight core genes were identified: BCMO1, FHL1, ST14, PPAP2C, SNCAIP, EML1, F10, and ADM2. The ROC curves demonstrated that the optimal model exhibited excellent performance in the training set and three validation sets (GSE75037, GSE32863, and GSE115002) (Fig. 2B–E). The AUC for the training set and two validation sets (GSE75037 and GSE32863) was 1.000, while the AUC for the GSE115002 validation set was 0.996. These results highlight the model’s exceptional diagnostic accuracy and robust generalization performance. Furthermore, the confusion matrix confirmed the model’s strong classification ability across different datasets (Fig. S5). Volcano and box plots revealed differential gene expression in the tumor group, where ST14, BCMO1, ADM2, and PPAP2C were upregulated, while FHL1, EML1, F10, and SNCAIP were downregulated (Fig. 2F,G). Additionally, correlation analysis among the model genes was conducted (Fig. 2H). Figure 2I illustrates the ROC curves of the selected genes, with all genes achieving AUC values greater than 0.7. Among these, ST14 demonstrated the highest AUC (0.842), indicating its superior predictive performance. Moreover, the protein interaction network of the identified genes was constructed (Fig. 2J), further elucidating their functional relationships and biological relevance.

Single-cell analysis of model genes

The model genes were characterized in nine LUAD samples from the GSE189357 dataset using single-cell sequencing analysis. Quality control and PCA were conducted as part of the single-cell RNA sequencing workflow. Figure S6A illustrates the distribution of gene feature counts, total RNA counts, and mitochondrial gene ratios. Figure S6B highlights the highly loaded genes within the first four principal components of PCA. Figure S6C provides a PCA heatmap, revealing clustering patterns of gene expression across principal components, while Fig. S6D evaluates the significance of these components through p-values, serving as a guide for subsequent analyses. t-SNE plots identified 29 cellular subpopulations, highlighting their distribution characteristics and heterogeneity in two-dimensional space (Fig. 3A). Additionally, t-SNE plots revealed the distribution of distinct cell types in the single-cell RNA sequencing data, including monocytes, stromal cells, dendritic cells (DCs), fibroblasts, epithelial cells, macrophages, and BM cells, each demonstrating clear clustering patterns (Fig. 3B). The expression distribution of five key model genes (FHL1, SNCAIP, EML1, F10, and ADM2) was further analyzed within single-cell t-SNE plots (Fig. 3C). Bubble plots were utilized to illustrate the expression of these genes across various cellular subpopulations (Fig. 3D). In these plots, the vertical axis represents cellular subpopulations, the horizontal axis denotes the genes, the bubble size reflects the expression proportion of the genes in corresponding subpopulations, and the bubble color indicates the average expression level. Notably, FHL1 exhibited a higher expression proportion and average expression level in several subpopulations, such as subpopulations 8 and 12.

Expression profile, prognosis and enrichment analysis of ST14

Since the ST14 gene exhibited the best diagnostic efficacy in the model (AUC = 0.842), its role in LUAD was further investigated. As shown in Fig. 4A, the expression of the ST14 gene was higher in LUAD tumor tissues compared to normal tissues. Figure 4B further revealed that ST14 expression was also markedly higher in paired tumor tissues relative to normal tissues. Additionally, the data in Fig. 4C indicated that elevated ST14 expression may be linked to a poorer prognosis. GO analysis (Fig. 4D) revealed that the low ST14 expression group was notably enriched in immune-related biological processes, such as lymphocyte-mediated immunity and adaptive immune response, while the high ST14 expression group was primarily enriched in processes related to cell division, including mitotic nuclear division, mitochondrial-associated cellular components, and sister chromatid segregation. KEGG analysis (Fig. 4E) revealed that the high ST14 expression group was notably enriched in pathways involved in DNA replication, base excision repair, fructose and mannose metabolism, glutathione metabolism, and the pentose phosphate pathway. In contrast, the low ST14 expression group was significantly enriched in immune-related pathways, including hematopoietic cell lineage, cytokine-cytokine receptor interactions, chemokine signaling, systemic lupus erythematosus, and intestinal immune networks for IgA production. Figure 4F presents a bar plot of t-values for GSVA (Gene Set Variation Analysis) scores across different GO functions in ST14 samples. The t-values reflect the extent of enrichment or suppression of each gene set. Numerous GO terms associated with metabolic processes, cell differentiation, and immune responses exhibit upregulation (red), while terms related to calcium ion transport display downregulation (green). Gray bars indicate no significant change in these terms. 
Specifically, the top three upregulated GO terms include branching involved in labyrinthine layer morphogenesis, hexose catabolic process, and monosaccharide catabolic process. Figure 4G illustrates the t-value bar plots of GSVA scores for various KEGG pathways in ST14 samples. The t-values represent the degree of enrichment or inhibition of each pathway. Many KEGG pathways related to metabolic processes, signaling, and immune responses show upregulation (red), while others associated with cell migration and immune responses are downregulated (green). Gray bars indicate no significant changes in these pathways. Specifically, the top three upregulated KEGG pathways include pentose phosphate pathway, glycosphingolipid biosynthesis, and fructose and mannose metabolism.

Immunoinfiltration analysis of ST14

With the increasing focus on the TME, accumulating evidence suggests that tumor-infiltrating immune cells play a critical role in cancer initiation, progression, and aggressiveness^31,32. The correlation between ST14 and the TME was further explored. The analysis demonstrated that the ST14 high-expression group had significantly higher scores in StromalScore, ImmuneScore, and ESTIMATEScore compared to the low-expression group (Fig. 5A). Figure 5B illustrates the differences in immune cell infiltration between the ST14 expression groups. Specifically, the ST14 low-expression group exhibited higher infiltration levels of Plasma cells, T cells gamma delta, and Mast cells resting, while the ST14 high-expression group had higher infiltration levels of T cells regulatory (Tregs), Macrophages M0, and Mast cells activated. Further correlation analysis demonstrated significant associations between ST14 expression and various immune cell types (Fig. 5C). Notably, ST14 showed a positive correlation with Macrophages M0 (r = 0.27, p = 3.8e−09), activated Mast cells (r = 0.15, p = 0.0012), NK cells resting (r = 0.099, p = 0.033), and T cells regulatory (Tregs) (r = 0.21, p = 8.3e−06), while it exhibited a negative correlation with resting Mast cells (r = − 0.12, p = 0.0081), Plasma cells (r = − 0.18, p = 8.9e−05), and T cells gamma delta (r = − 0.21, p = 6.3e−06). Additionally, a significant correlation between ST14 expression and tumor mutation burden was observed (r = 0.12, p = 0.0057) (Fig. 5C). The lollipop plot (Fig. 5D) further highlighted the significant correlations between multiple immune cell types and ST14 gene expression.
ST14 was positively correlated with Macrophages M0 (r = 0.27, p < 0.001), T cells regulatory (Tregs) (r = 0.21, p < 0.001), activated Mast cells (r = 0.15, p = 0.001), and NK cells resting (r = 0.10, p = 0.033), while it was negatively correlated with resting Mast cells (r = − 0.12, p = 0.008), Plasma cells (r = − 0.18, p < 0.001), and T cells gamma delta (r = − 0.21, p < 0.001). Furthermore, 12 ICP genes associated with ST14 were identified, with CD276 exhibiting the highest correlation coefficient (COR = 0.47) (Fig. 5E). Finally, the relationship between ST14 expression and immunotherapy efficacy (PD1 and CTLA4 treatment) was analyzed (Fig. 5F), and the results indicated a strong correlation between ST14 expression and the treatment efficacy of PD1 and CTLA4.

Drug sensitivity analysis and single-cell analysis of ST14

Based on the expression levels of ST14, LUAD samples were classified into high and low expression groups to examine their relationship with drug sensitivity (Fig. 6). The analysis indicated that the low ST14 expression group exhibited lower IC50 values for Linsitinib, Lenalidomide, Doxorubicin, Suboxone, Rucaparib, Palbociclib, OSI-027, and GSK2126458, suggesting increased sensitivity to these drugs. In contrast, the high ST14 expression group showed lower IC50 values for Cadrine, ERK_6604, BMS-754807, and AZD2014, indicating greater sensitivity to these drugs. To investigate the expression pattern of ST14 across various cellular subpopulations in NSCLC, the single-cell database TISCH (http://tisch.comp-genomics.org/search-gene/) was used to analyze eight different single-cell datasets (GSE117570, GSE127465, GSE146100, GSE143423, GSE150660, GSE131907, GSE162498, GSE139955). As illustrated in Fig. 7, ST14 is predominantly expressed in monocytes, macrophages, malignant cells, and epithelial cells.

Model genes expression in LUAD cell lines and tissues

The expression levels of BCMO1, FHL1, ST14, PPAP2C, SNCAIP, EML1, F10, and ADM2 in LUAD cell lines (PC9, A549, H1299, HCC827) and Beas-2B control cells were analyzed using qRT-PCR. The results demonstrated that BCMO1, ST14, PPAP2C, and ADM2 were upregulated in LUAD cell lines, whereas FHL1, SNCAIP, EML1, and F10 were downregulated (Fig. 8A). Additionally, in 30 paired LUAD tumor and adjacent normal tissue samples, a similar expression pattern was observed: BCMO1, ST14, PPAP2C, and ADM2 were upregulated in tumor tissues, while FHL1, SNCAIP, EML1, and F10 were downregulated (Fig. 8B). Furthermore, immunohistochemical data from the HPA database corroborated the upregulation of ST14 and PPAP2C in lung cancer tissues, whereas FHL1, SNCAIP, and EML1 were found to be downregulated (Fig. S7).

Discussion

LUAD is a major subtype of NSCLC^33,34, characterized by its high aggressiveness and poor prognosis. Despite advances in targeted therapies and immunotherapies, the heterogeneity of LUAD presents significant challenges to treatment efficacy and patient prognosis^35,36. Therefore, the identification of stable and clinically relevant molecular markers is essential for advancing individualized diagnosis and treatment.

The aim of this study was to systematically identify key genes associated with disease progression and immune regulation in LUAD by integrating multiple bioinformatics strategies and to evaluate their potential utility in diagnosis, prognosis, and therapeutic targeting. To achieve this, two LUAD gene expression datasets (GSE32665 and GSE68465) were integrated, and batch effect correction was performed, which markedly improved the consistency of gene expression across datasets. A total of 645 DEGs, including 358 upregulated and 287 downregulated genes, were identified through differential expression analysis, providing a foundation for subsequent investigations. A weighted gene co-expression network was then constructed, resulting in the identification of ten functional modules. Among them, the red module demonstrated the strongest correlation with LUAD clinical traits and showed a high level of consistency in both gene significance and module membership, suggesting that it may be enriched with functionally relevant genes. The intersection of the red module and DEGs was then used for machine learning modeling, leading to the development of a high-performance diagnostic model with an average C-index of 0.999 and the identification of eight core diagnostic genes.

To evaluate the biological relevance of these genes, single-cell RNA sequencing data were incorporated to assess their expression across distinct cellular subpopulations. The results revealed cell type–specific expression patterns within the tumor microenvironment, suggesting potential involvement in immunoregulatory processes and clinical applicability. Notably, ST14 exhibited consistent patterns across differential expression analysis, WGCNA, machine learning, and single-cell analysis, with its high expression significantly associated with poor prognosis, further supporting its potential as a diagnostic and prognostic biomarker for LUAD. Although certain analyses in this study remain at a preliminary, exploratory stage—particularly the single-cell validation, which was limited to expression-level assessment—the overall findings provide a solid foundation for future mechanistic research and clinical translation.

The findings of this study offer insight into the molecular mechanisms of LUAD and highlight the potential of specific pathways in guiding therapeutic strategies. KEGG pathway analysis of the intersecting genes identified three key pathways: the renin-angiotensin system (RAS), the TGF-β signaling pathway, and carbohydrate metabolism, all of which are closely associated with tumor progression and patient prognosis. The RAS, traditionally recognized for its role in regulating blood pressure and fluid balance, has also gained attention for its involvement in cancer biology. RAS components have been shown to promote tumor growth and metastasis through mechanisms such as angiogenesis and immunomodulation³⁷. The significant enrichment of RAS-related genes in LUAD observed in our analysis suggests that targeting this pathway may offer novel therapeutic strategies, particularly in patients with elevated expression of RAS components. In the TGF-β signaling pathway, TGF-β can induce epithelial-to-mesenchymal transition (EMT), enhancing metastatic potential, and can modulate immune responses within the tumor microenvironment^38,39,40. Finally, carbohydrate metabolic pathways, especially those involving fructose and mannose metabolism, have emerged as key regulators of cancer cell proliferation and survival⁴¹. Alterations in metabolic pathways are increasingly recognized as hallmarks of cancer, with metabolic reprogramming enabling tumor cells to thrive in nutrient-poor environments^42,43,44,45. Taken together, the identification of these pathways deepens our understanding of LUAD biology and may inform the development of new targeted therapies.
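Pathway enrichment of the kind described here is often assessed with an over-representation test based on the hypergeometric distribution. The sketch below shows that calculation under the assumption that such a test was used. The section does not name the statistical test, and the function name and all counts are hypothetical.

```python
from math import comb


def enrichment_p_value(background_size, pathway_size, query_size, overlap):
    """Upper-tail hypergeometric p-value for pathway over-representation.

    Probability of observing at least `overlap` pathway genes when
    drawing `query_size` genes without replacement from a background of
    `background_size` genes, of which `pathway_size` belong to the pathway.
    """
    upper = min(pathway_size, query_size)
    total = comb(background_size, query_size)
    tail = sum(
        comb(pathway_size, i)
        * comb(background_size - pathway_size, query_size - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 20,000 background genes, a 100-gene pathway,
# a 200-gene query list, and 5 query genes falling in the pathway
p = enrichment_p_value(20000, 100, 200, 5)
print(f"p = {p:.3g}")
```

In practice the p-values for all tested pathways would also be adjusted for multiple testing before any pathway is called enriched.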

Previous studies have demonstrated that ST14 is linked to tumor progression and poor prognosis in multiple cancer types^46,47,48. The association between elevated ST14 expression and reduced OS in LUAD patients further highlights its clinical relevance, underscoring the need for further investigation into its role in tumor biology. The immune status of LUAD is increasingly acknowledged as a key factor in tumor progression and patient prognosis⁴⁹. Our analysis highlighted a strong correlation between ST14 expression and several immune cell types, notably M0 macrophages, activated mast cells, and regulatory T cells (Tregs). M0 macrophages exhibit dual roles in the tumor microenvironment, displaying both pro- and anti-tumorigenic activities depending on their polarization status^50,51,52,53. The positive correlation between ST14 and M0 macrophages suggests that ST14 may influence macrophage polarization and function, potentially regulating tumor progression and patient prognosis. This association implies that ST14 could contribute to the formation of a tumor-promoting environment by modulating macrophage polarization. Previous studies have linked macrophage infiltration to poor prognosis across multiple cancer types^54,55. The relationship between ST14 and Tregs is particularly noteworthy. Tregs are known to suppress anti-tumor immune responses, enabling tumor cells to evade immune surveillance^56,57. The positive correlation observed in our study indicates that high ST14 expression may increase Treg populations, thereby exacerbating immune evasion in LUAD. This finding aligns with existing literature, which highlights the role of Tregs in promoting tumor growth and metastasis^58,59. Moreover, the association between ST14 and activated mast cells adds further complexity to the immune landscape of LUAD. 
Mast cells participate in various immune responses, including angiogenesis and tissue remodeling^60,61,62, and their activation has been linked to tumor progression^63,64,65. The positive correlation between ST14 and activated mast cells suggests that ST14 may influence mast cell activity, potentially creating a microenvironment conducive to tumor growth. This is consistent with findings that mast cell infiltration is associated with poor prognosis in various malignancies^66,67. In summary, our findings emphasize the role of ST14 in shaping the LUAD immune microenvironment, particularly through its interactions with M0 macrophages, Tregs, and activated mast cells. These insights enhance our understanding of LUAD biology and suggest potential therapeutic targets for modulating the immune response in this malignancy. Future studies should focus on elucidating the precise mechanisms by which ST14 affects immune cell dynamics and explore its potential as a biomarker for immunotherapy response.
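The correlations between ST14 expression and immune cell abundance discussed above can be illustrated with a rank-based (Spearman) correlation. The sketch below is a plain-Python version under the assumption that a rank correlation is appropriate. The section does not state which coefficient was used, and the expression values and cell fractions are invented for illustration.

```python
def ranks(values):
    """1-based ranks; tied values receive the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # extend j across a run of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical per-sample ST14 expression and estimated Treg fraction
st14_expression = [2.1, 3.4, 1.0, 5.2, 4.8]
treg_fraction = [0.02, 0.05, 0.01, 0.06, 0.09]
print(f"rho = {spearman(st14_expression, treg_fraction):.2f}")
```

A positive coefficient of this kind describes co-variation across samples only. It does not by itself show that ST14 alters the abundance or state of the cell type.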

Immune checkpoints play a critical regulatory role in tumor immune surveillance⁶⁸. Among these checkpoint molecules, PD-1 (programmed cell death protein 1) and CTLA-4 (cytotoxic T lymphocyte-associated protein 4) are the most extensively studied^69,70. It has been demonstrated that the expression of these molecules influences the efficacy of immunotherapy⁷¹. We investigated the association between ST14 expression and the efficacy of immunotherapy, specifically PD-1 and CTLA-4 treatments. The results demonstrated a significant correlation between ST14 expression and therapeutic outcomes for PD-1 and CTLA-4, implying that ST14 could serve as a potential biomarker for predicting response to these therapies. Additionally, enrichment analysis identified that ST14 is associated with various biological processes and pathways, particularly those related to cell division and metabolic processes. These findings position ST14 not only as a potential prognostic marker but also as a candidate target for LUAD treatment. The observed differences in drug sensitivity based on ST14 expression levels further emphasize its potential role in personalized medicine, suggesting that ST14 could guide therapeutic decisions and improve patient prognosis. Moreover, gene expression was analyzed at single-cell resolution across multiple cell types derived from the tumor microenvironment. In tumor samples with significant cellular heterogeneity, single-cell data were reduced in dimensionality and visualized using the t-SNE method. The results revealed the expression patterns of six model genes (FHL1, SNCAIP, EML1, F10, ADM2, and ST14), with particular emphasis on ST14. In single-cell t-SNE maps, ST14 expression was predominantly concentrated in monocytes, macrophages, malignant cells, and epithelial cells, suggesting that ST14 may play a key role in the tumor immune microenvironment. 
These findings indicate that ST14 may regulate interactions between immune cells and tumor cells, contributing to tumor progression. Finally, the expression of eight model genes was analyzed in LUAD cell lines (PC9, A549, H1299, HCC827) and Beas-2B control cells, as well as in LUAD tumors and adjacent normal tissues using qRT-PCR. The expression patterns of these genes provide valuable insights into the pathogenesis of lung cancer and may serve as a theoretical foundation for developing novel therapeutic targets.
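The cell type–level expression patterns described above amount to summarizing a gene's expression within each annotated cell population. The sketch below shows one simple way to do this, assuming each cell carries a cell-type label and a normalized expression value. The data structure, function name, and values are hypothetical and do not reproduce the single-cell pipeline of the study.

```python
from collections import defaultdict


def summarize_by_cell_type(cells, gene):
    """Mean expression and fraction of expressing cells per cell type.

    `cells` is an iterable of (cell_type, expression_dict) pairs, where
    expression_dict maps gene symbols to normalized expression values.
    A gene missing from a cell's dict is treated as not expressed (0.0).
    """
    # per cell type: [number of cells, summed expression, cells with expression > 0]
    totals = defaultdict(lambda: [0, 0.0, 0])
    for cell_type, expression in cells:
        value = expression.get(gene, 0.0)
        entry = totals[cell_type]
        entry[0] += 1
        entry[1] += value
        entry[2] += 1 if value > 0 else 0
    return {
        cell_type: {"mean": total / n, "fraction_expressing": expressing / n}
        for cell_type, (n, total, expressing) in totals.items()
    }


# Hypothetical annotated cells with normalized ST14 expression
cells = [
    ("Macrophage", {"ST14": 2.0}),
    ("Macrophage", {"ST14": 0.0}),
    ("T cell", {}),
    ("Malignant", {"ST14": 3.0}),
    ("Malignant", {"ST14": 1.0}),
]
summary = summarize_by_cell_type(cells, "ST14")
for cell_type, stats in summary.items():
    print(cell_type, stats)
```

Reporting the fraction of expressing cells alongside the mean helps distinguish a gene expressed at low levels in most cells of a population from one expressed strongly in only a few.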

Although this study systematically constructed and validated potential biomarkers for LUAD through the integration of multi-omics data, several limitations remain. First, the analysis primarily relied on publicly available datasets, which may be subject to sample selection bias, incomplete clinical annotation, and inherent data heterogeneity. Second, although key genes were cross-validated in multiple independent datasets, supporting experimental evidence from prospective clinical samples is still lacking. Additionally, the single-cell RNA sequencing analysis was limited to expression distribution, without further investigation into the specific roles of these genes in cell fate determination, signaling pathway regulation, or immune function. Therefore, future studies should focus on validation in larger, independent prospective cohorts and incorporate functional experiments to further elucidate the biological roles of the identified genes in LUAD progression and therapeutic response.

Conclusion

This study underscores the utility of LUAD biomarkers in enhancing diagnostic precision and prognostic stratification through a comprehensive analysis of publicly accessible datasets, complemented by the application of WGCNA and advanced machine learning models. Notably, the core gene ST14 was significantly correlated with clinical features and immune cell infiltration, highlighting its pivotal role within the TME. Moreover, the integration of single-cell RNA sequencing data enabled a detailed exploration of gene expression heterogeneity across various cell types. While these findings pave the way for further investigation of therapeutic potential and drug sensitivity, the inherent limitations of the study should be acknowledged. Overall, this work contributes valuable insights into the biology of LUAD and its prospective clinical applications.

Data availability

The dataset employed for this study is readily available from online repositories. Detailed information regarding the dataset’s name and accession numbers can be found within the article for easy reference.

References

Cui, J. J. et al. Gene-gene and gene-environment interactions influence platinum-based chemotherapy response and toxicity in non-small cell lung cancer patients. Sci. Rep. 7, 5082. https://doi.org/10.1038/s41598-017-05246-8 (2017).
Zhang, H. et al. ARNTL2 is an indicator of poor prognosis, promotes epithelial-to-mesenchymal transition and inhibits ferroptosis in lung adenocarcinoma. Transl. Oncol. 26, 101562. https://doi.org/10.1016/j.tranon.2022.101562 (2022).
Duma, N., Santana-Davila, R. & Molina, J. R. Non-small cell lung cancer: epidemiology, screening, diagnosis, and treatment. Mayo Clin. Proc. 94, 1623–1640. https://doi.org/10.1016/j.mayocp.2019.01.013 (2019).
Lee, Y. S., Yeo, I. J., Kim, K. C., Han, S. B. & Hong, J. T. Inhibition of lung tumor development in ApoE knockout mice via enhancement of TREM-1 dependent NK cell cytotoxicity. Front. Immunol. 10, 1379. https://doi.org/10.3389/fimmu.2019.01379 (2019).
Chen, C., Jia, F., Hou, Z., Ruan, S. & Lu, Q. Delivery of Paeonol by nanoparticles enhances its in vitro and in vivo antitumor effects. Int. J. Nanomed. 12, 6605–6616. https://doi.org/10.2147/ijn.S143938 (2017).
Li, G. et al. Downregulation of NEDD4L by EGFR signaling promotes the development of lung adenocarcinoma. J. Transl. Med. 20, 47. https://doi.org/10.1186/s12967-022-03247-4 (2022).
Deng, Z. et al. Genomic comparison between cerebrospinal fluid and primary tumor revealed the genetic events associated with brain metastasis in lung adenocarcinoma. Cell. Death Dis. 12, 935. https://doi.org/10.1038/s41419-021-04223-4 (2021).
Zhang, J., Kuang, T., Dong, K., Yu, J. & Wang, W. Leveraging an immune cell signature to improve the survival and immunotherapy response of lung adenocarcinoma. J. Cancer. 15, 747–763. https://doi.org/10.7150/jca.90515 (2024).
Zhang, Y. et al. Global variations in lung cancer incidence by histological subtype in 2020: a population-based study. Lancet Oncol. 24, 1206–1218. https://doi.org/10.1016/s1470-2045(23)00444-8 (2023).
Cheng, G. et al. β-elemene suppresses tumor metabolism and stem cell-like properties of non-small cell lung cancer cells by regulating PI3K/AKT/mTOR signaling. Am. J. Cancer Res. 12, 1535–1555 (2022).
Xu, Z. et al. Efficacy of first-line treatments in the elderly and non-elderly patients with advanced epidermal growth factor receptor mutated, non-small cell lung cancer: a network meta-analysis. BMC Cancer. 22, 514. https://doi.org/10.1186/s12885-022-09592-3 (2022).
Ohgino, K. et al. Intracellular levels of reactive oxygen species correlate with ABT-263 sensitivity in non-small-cell lung cancer cells. Cancer Sci. 111, 3793–3801. https://doi.org/10.1111/cas.14569 (2020).
Zhao, M. et al. miR-495 and miR-5688 are down-regulated in non-small cell lung cancer under hypoxia to maintain interleukin-11 expression. Cancer Commun. (Lond). 40, 435–452. https://doi.org/10.1002/cac2.12076 (2020).
Chen, F. T. & Zhong, F. K. Kinesin family member 18A (KIF18A) contributes to the proliferation, migration, and invasion of lung adenocarcinoma cells in vitro and in vivo. Dis Mark. 2019, 6383685. https://doi.org/10.1155/2019/6383685 (2019).
Li, M. et al. Dual-energy spectral CT characteristics in surgically resected lung adenocarcinoma: comparison between Kirsten rat sarcoma viral oncogene mutations and epidermal growth factor receptor mutations. Cancer Imaging. 19, 77. https://doi.org/10.1186/s40644-019-0261-1 (2019).
Zhang, H., Shu, R., Liu, X., Zhang, X. & Sun, D. Downregulation of REV-ERBα is associated with the progression of lung adenocarcinoma. Ann. Transl. Med. 10, 56. https://doi.org/10.21037/atm-21-6405 (2022).
Liu, Z. et al. Uncovering the CeRNA network related to the prognosis of stomach adenocarcinoma among 898 patient samples. Biochem. Genet. 62, 4770–4790. https://doi.org/10.1007/s10528-023-10656-7 (2024).
Liu, Z. et al. The comprehensive and systematic identification of BLCA-specific SF-regulated, survival-related AS events. Gene 835, 146657. https://doi.org/10.1016/j.gene.2022.146657 (2022).
Barrett, T. et al. NCBI GEO: mining millions of expression profiles–database and tools. Nucleic Acids Res. 33, D562–566. https://doi.org/10.1093/nar/gki022 (2005).
Leek, J. T., Johnson, W. E., Parker, H. S., Jaffe, A. E. & Storey, J. D. The Sva package for removing batch effects and other unwanted variation in high-throughput experiments. Bioinformatics 28, 882–883. https://doi.org/10.1093/bioinformatics/bts034 (2012).
Author Correction. Comprehensive molecular profiling of lung adenocarcinoma. Nature. 559, E12. https://doi.org/10.1038/s41586-018-0228-6 (2018).
Langfelder, P. & Horvath, S. WGCNA: an R package for weighted correlation network analysis. BMC Bioinform. 9, 559. https://doi.org/10.1186/1471-2105-9-559 (2008).
Mostafavi, S., Ray, D., Warde-Farley, D., Grouios, C. & Morris, Q. GeneMANIA: a real-time multiple association network integration algorithm for predicting gene function. Genome Biol. 9 (Suppl 1), 4. https://doi.org/10.1186/gb-2008-9-s1-s4 (2008).
Sun, D. et al. TISCH: a comprehensive web resource enabling interactive single-cell transcriptome visualization of tumor microenvironment. Nucleic Acids Res. 49, D1420–D1430. https://doi.org/10.1093/nar/gkaa1020 (2021).
Hu, J. et al. Siglec15 shapes a non-inflamed tumor microenvironment and predicts the molecular subtype in bladder cancer. Theranostics 11, 3089–3108. https://doi.org/10.7150/thno.53649 (2021).
Maeser, D., Gruener, R. F. & Huang, R. S. OncoPredict: an R package for predicting in vivo or cancer patient drug response and biomarkers from cell line screening data. Brief. Bioinform. 22 https://doi.org/10.1093/bib/bbab260 (2021).
Fu, J. et al. Large-scale public data reuse to model immunotherapy response and resistance. Genome Med. 12, 21. https://doi.org/10.1186/s13073-020-0721-z (2020).
Kanehisa, M., Furumichi, M., Sato, Y., Matsuura, Y. & Ishiguro-Watanabe, M. KEGG: biological systems database as a model of the real world. Nucleic Acids Res. 53, D672–D677. https://doi.org/10.1093/nar/gkae909 (2025).
Kanehisa, M. Toward Understanding the origin and evolution of cellular organisms. Protein Sci. 28, 1947–1951. https://doi.org/10.1002/pro.3715 (2019).
Kanehisa, M. & Goto, S. KEGG: Kyoto encyclopedia of genes and genomes. Nucleic Acids Res. 28, 27–30. https://doi.org/10.1093/nar/28.1.27 (2000).
Liu, Z., Petinrin, O. O., Toseef, M., Chen, N. & Wong, K. C. Construction of immune Infiltration-Related LncRNA signatures based on machine learning for the prognosis in Colon cancer. Biochem. Genet. 62, 1925–1952. https://doi.org/10.1007/s10528-023-10516-4 (2024).
Liu, Z., Georgakopoulos-Soares, I., Ahituv, N. & Wong, K. C. Risk scoring based on DNA methylation-driven related DEGs for colorectal cancer prognosis with systematic insights. Life Sci. 316, 121413. https://doi.org/10.1016/j.lfs.2023.121413 (2023).
Shi, Y. X., Zhang, W. D., Dai, P. H., Deng, J. & Tan, L. H. Comprehensive analysis of KCTD family genes associated with hypoxic microenvironment and immune infiltration in lung adenocarcinoma. Sci. Rep. 12, 9938. https://doi.org/10.1038/s41598-022-14250-6 (2022).
Zhao, H. et al. A five-cuproptosis-related LncRNA signature: predicting prognosis, assessing immune function & drug sensitivity in lung squamous cell carcinoma. J. Cancer. 14, 1499–1514. https://doi.org/10.7150/jca.82370 (2023).
Busca, S., Salleron, J., Boidot, R., Merlin, J. L. & Harlé, A. Somatic substitution signature as an innovative tool in lung cancer diagnosis. Sci. Rep. 9, 14561. https://doi.org/10.1038/s41598-019-51155-3 (2019).
Yao, Y. et al. Comprehensive analysis of prognostic biomarkers in lung adenocarcinoma based on aberrant lncRNA-miRNA-mRNA networks and Cox regression models. Biosci. Rep. 40 https://doi.org/10.1042/bsr20191554 (2020).
Zuern, C. et al. Microtubule associated tumor suppressor 1 deficient mice develop spontaneous heart hypertrophy and SLE-like lymphoproliferative disease. Int. J. Oncol. 40, 1079–1088. https://doi.org/10.3892/ijo.2011.1311 (2012).
Xiao, L. et al. Pentosan polysulfate ameliorates fibrosis and inflammation markers in SV40 MES13 cells by suppressing activation of PI3K/AKT pathway via miR-446a-3p. BMC Nephrol. 23, 105. https://doi.org/10.1186/s12882-022-02732-8 (2022).
Lho, Y. et al. Effects of TGF-β1 receptor inhibitor GW788388 on the epithelial to mesenchymal transition of peritoneal mesothelial cells. Int. J. Mol. Sci. 22 https://doi.org/10.3390/ijms22094739 (2021).
Jung, J. et al. Gas6 prevents epithelial-Mesenchymal transition in alveolar epithelial cells via production of PGE(2), PGD(2) and their receptors. Cells. 8 https://doi.org/10.3390/cells8070643 (2019).
Xie, H. et al. Proteomics analysis to reveal biological pathways and predictive proteins in the survival of high-grade serous ovarian cancer. Sci. Rep. 7, 9896. https://doi.org/10.1038/s41598-017-10559-9 (2017).
Agarwal, S. et al. PAICS, a de Novo purine biosynthetic enzyme, is overexpressed in pancreatic Cancer and is involved in its progression. Transl. Oncol. 13, 100776. https://doi.org/10.1016/j.tranon.2020.100776 (2020).
Jing, W. et al. Genetic profiling of breast cancer with and without preexisting metabolic disease. Transl. Oncol. 13, 245–253. https://doi.org/10.1016/j.tranon.2019.09.008 (2020).
Sun, H. et al. Development and validation of a selenium metabolism regulators associated prognostic model for hepatocellular carcinoma. BMC Cancer. 23, 451. https://doi.org/10.1186/s12885-023-10944-w (2023).
Chen, Y., Wang, Z., Deng, Q., Chen, Y. & Liang, H. Mitochondrial phosphoenolpyruvate Carboxykinase inhibits kidney renal clear cell carcinoma malignant progression, leading to cell energy metabolism imbalance. Am. J. Cancer Res. 13, 886–899 (2023).
Nie, X. et al. ST14 interacts with TMEFF1 and is a predictor of poor prognosis in ovarian cancer. BMC Cancer. 24, 330. https://doi.org/10.1186/s12885-024-11958-8 (2024).
Dai, Y. H. et al. Gene-associated methylation status of ST14 as a predictor of survival and hormone receptor positivity in breast Cancer. BMC Cancer. 21, 945. https://doi.org/10.1186/s12885-021-08645-3 (2021).
Kim, K. Y. et al. Targeting metastatic breast cancer with peptide epitopes derived from autocatalytic loop of Prss14/ST14 membrane Serine protease and with monoclonal antibodies. J. Exp. Clin. Cancer Res. 38, 363. https://doi.org/10.1186/s13046-019-1373-y (2019).
Ren, C., Li, J., Zhou, Y., Zhang, S. & Wang, Q. Typical tumor immune microenvironment status determine prognosis in lung adenocarcinoma. Transl. Oncol. 18, 101367. https://doi.org/10.1016/j.tranon.2022.101367 (2022).
Yang, M., Liu, J. W., Zhang, Y. T. & Wu, G. The role of renal macrophage, AIM, and TGF-β1 expression in renal fibrosis progression in IgAN patients. Front. Immunol. 12, 646650. https://doi.org/10.3389/fimmu.2021.646650 (2021).
Jaggi, U. et al. Increased phagocytosis in the presence of enhanced M2-like macrophage responses correlates with increased primary and latent HSV-1 infection. PLoS Pathog. 16, e1008971. https://doi.org/10.1371/journal.ppat.1008971 (2020).
Yan, K. et al. Multi-omics analysis identifies FoxO1 as a regulator of macrophage function through metabolic reprogramming. Cell. Death Dis. 11, 800. https://doi.org/10.1038/s41419-020-02982-0 (2020).
Zhou, P. et al. Quercetin-solid lipid nanoparticle-embedded hyaluronic acid functionalized hydrogel for Immunomodulation to promote bone reconstruction. Regen Biomater. 10, rbad025. https://doi.org/10.1093/rb/rbad025 (2023).
Pritchard, A. et al. Lung tumor Cell-Derived exosomes promote M2 macrophage polarization. Cells. 9 https://doi.org/10.3390/cells9051303 (2020).
Zekri, J., Mansour, M. & Karim, S. M. The anti-tumour effects of Zoledronic acid. J. Bone Oncol. 3, 25–35. https://doi.org/10.1016/j.jbo.2013.12.001 (2014).
Pei, L. et al. Roles of cancer-associated fibroblasts (CAFs) in anti- PD-1/PD-L1 immunotherapy for solid cancers. Mol. Cancer. 22, 29. https://doi.org/10.1186/s12943-023-01731-z (2023).
Zhang, T. et al. RS1 gene is a novel prognostic biomarker for lung adenocarcinoma. Thorac. Cancer. 13, 1850–1861. https://doi.org/10.1111/1759-7714.14471 (2022).
Lampis, A. et al. Challenges and perspectives for immunotherapy in oesophageal cancer: A look to the future (Review). Int. J. Mol. Med. 47 https://doi.org/10.3892/ijmm.2021.4930 (2021).
Wu, Q. et al. Cancer-associated adipocytes as immunomodulators in cancer. Biomark. Res. 9, 2. https://doi.org/10.1186/s40364-020-00257-6 (2021).
Słodka, A., Wiktorska, M. & Brzezińska-Błaszczyk, E. IgE by itself affects mature rat mast cell preformed and de novo-synthesized mediator release and amplifies mast cell migratory response. PLoS One. 8, e79286. https://doi.org/10.1371/journal.pone.0079286 (2013).
Yang, H., Han, X. & Hao, Z. An immune-gene-based classifier predicts prognosis in patients with cervical squamous cell carcinoma. Front. Mol. Biosci. 8, 679474. https://doi.org/10.3389/fmolb.2021.679474 (2021).
Ebrahem, Q. et al. Increased neovascularization in mice lacking tissue inhibitor of metalloproteinases-3. Investig. Ophthalmol. Vis. Sci. 52, 6117–6123. https://doi.org/10.1167/iovs.10-5899 (2011).
Nonomura, N. et al. Decreased number of mast cells infiltrating into needle biopsy specimens leads to a better prognosis of prostate cancer. Br. J. Cancer. 97, 952–956. https://doi.org/10.1038/sj.bjc.6603962 (2007).
Yu, Y. et al. A signature of immune-related gene pairs (IRGPs) for risk stratification and prognosis of oral cancer patients. World J. Surg. Oncol. 20, 227. https://doi.org/10.1186/s12957-022-02630-1 (2022).
Tie, Y., Tang, F., Wei, Y. Q. & Wei, X. W. Immunosuppressive cells in cancer: mechanisms and potential therapeutic targets. J. Hematol. Oncol. 15, 61. https://doi.org/10.1186/s13045-022-01282-8 (2022).
Bo, X. et al. High infiltration of mast cells is associated with improved response to adjuvant chemotherapy in gallbladder cancer. Cancer Sci. 111, 817–825. https://doi.org/10.1111/cas.14302 (2020).
Shan, L. et al. Identification of five m6A-Related LncRNA genes as prognostic markers for endometrial Cancer based on TCGA database. J. Immunol. Res. 2022, 2547029. https://doi.org/10.1155/2022/2547029 (2022).
Qiu, Y. et al. Activated T cell-derived Exosomal PD-1 attenuates PD-L1-induced immune dysfunction in triple-negative breast cancer. Oncogene. 40, 4992–5001. https://doi.org/10.1038/s41388-021-01896-1 (2021).
Houssiau, H. & Seront, E. Improving the role of immune checkpoint inhibitors in the management of advanced urothelial carcinoma, where do we stand? Transl. Oncol. 19, 101387. https://doi.org/10.1016/j.tranon.2022.101387 (2022).
Tang, Y., Hu, Y., Niu, Y., Sun, L. & Guo, L. CCL5 as a prognostic marker for survival and an Indicator for immune checkpoint therapies in small cell lung Cancer. Front. Med. (Lausanne). 9, 834725. https://doi.org/10.3389/fmed.2022.834725 (2022).
Tuo, B. et al. Roles of exosomal circrnas in tumour immunity and cancer progression. Cell. Death Dis. 13, 539. https://doi.org/10.1038/s41419-022-04949-9 (2022).

Acknowledgements

We extend our gratitude to the GEO and TCGA databases and to all contributors who generously shared their data on these platforms.

Funding

This research was funded by the Clinical Medical Research 4310 Program of the University of South China (20224310NFYCG07) and Scientific Research Fund Project of Hunan Provincial Health Commission (B202304029220).

Author information

Sixuan Wu contributed equally to this work.

Authors and Affiliations

Department of Oncology, The First Affiliated Hospital, Hengyang Medical School, University of South China, No. 69 Chuanshan Road, Hengyang, 421001, Hunan, China
Sixuan Wu, Yuanbin Tang, Yeru Tan & Yuehua Li
Clinical Oncology School of Fujian Medical University, Fujian Cancer Hospital, Fuzhou, 350014, China
Sixuan Wu, Qihong Pan, Yaqin Zheng & Junfan Pan

Contributions

The study was conceived and designed by Y.L. and S.W. J.P. and Y.T. analyzed the data. The manuscript was written by S.W., Y.T., and Q.P. The experiment was conducted by J.P. and S.W. The manuscript was revised by Y.L., Y.T., and J.P. The study was supervised by Y.Z. All authors read and approved the final manuscript.

Corresponding authors

Correspondence to Yeru Tan, Junfan Pan or Yuehua Li.

Ethics declarations

Ethics approval and consent to participate

All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and national research committee and with the 1964 Helsinki declaration and its later amendments. The project was approved by the institutional ethics committee of the Fujian Cancer Hospital. Informed consent was obtained from all participants included in the study.

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary Material 1

Supplementary Material 2

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Wu, S., Tang, Y., Pan, Q. et al. Unveiling diagnostic biomarkers and therapeutic targets in lung adenocarcinoma using bioinformatics and experimental validation. Sci Rep 15, 22893 (2025). https://doi.org/10.1038/s41598-025-05227-2

Received: 07 February 2025
Accepted: 02 June 2025
Published: 02 July 2025
DOI: https://doi.org/10.1038/s41598-025-05227-2