Introduction

Lung cancer is considered to be the most common cancer worldwide and is correlated with the prevalence of tobacco use1,2. In the past few decades, the incidence of lung cancer in China has been increasing3. The proportion of non-small cell lung cancer (NSCLC) among all types of lung cancer is 85%4, and lung adenocarcinoma (LUAD) is the most common type of NSCLC5. Recently, research has shown the high heterogeneity and complexity of LUAD, which is not reflected in the traditional histological classification of LUAD. These findings show that tumour classification requires reliable prognostically relevant factors to select more effective treatment strategies for patients with lung cancer. Lactic acid provides energy for lung tumour cells, more so than glucose does6. Therefore, the lactate pathway may become a novel LUAD treatment target.

As early as 1930, Otto Warburg showed that cancer cells prefer glycolysis even under normal oxygen conditions; this phenomenon is known as the “Warburg effect”7. Lactate, the reduction product of pyruvate via lactate dehydrogenase (LDH), is the final product of glycolysis8. In recent years, the role of lactate in tumours has gradually been uncovered. For example, lactate is a carbon source for the tricarboxylic acid (TCA) cycle in NSCLC and provides energy for tumour cells6; Bergers and Fendt reported that lactate directly promotes the invasion and migration of cancer cells9; and Fischer et al. reported that a high concentration of lactate in the tumour microenvironment (TME) blocks the export of lactate from T cells, thus disrupting their metabolism and function10. These studies show that lactate metabolism has the potential to become a new target for tumour therapy11.

Many genes have been confirmed to regulate lactate metabolism12,13,14,15,16. Although research related to lactate metabolism has become a hot topic in cancer treatment research, the correlation between the prognosis of lung adenocarcinoma (LUAD) in patients and the expression of lactate-related genes remains unclear. Therefore, to elucidate the heterogeneity of LUAD, identifying the molecular characteristics of lactate-related genes may aid in the early diagnosis and prevention of LUAD and provide clinical insights for the personalized treatment of individual LUAD patients.

In this study, 701 LUAD samples from the GEO and TCGA cohorts were classified into phenotypes on the basis of 36 genes related to lactate metabolism. LASSO and Cox regression analyses were used to construct a new score model to quantify the lactate-related patterns of individual patients. Furthermore, the lactate-related signature demonstrated good predictive ability for immune infiltrating cells, tumour mutation burden, and response to immunotherapy. The validity of the 16-lactate-related-gene (16-LRG) risk signature was subsequently verified in the GEO cohort GSE50081. These results indicate that lactate modification plays a nonnegligible role in shaping the characteristics of individual tumour microenvironments. Therefore, we established a scoring system to quantify the lactate modification patterns of individual patients, aiming to address early diagnosis and prognosis prediction in patients with LUAD.

Materials and methods

Acquisition of LUAD dataset

We downloaded three LUAD datasets: the TCGA-LUAD, GSE30219, and GSE37745 datasets. We removed LUAD samples with missing clinical data (such as stage and survival time), ultimately yielding 701 LUAD samples.

Acquisition of LRGs

We accessed the MSigDB website (https://www.gsea-msigdb.org) and searched for the specific subset using “lactate” as the keyword. We then downloaded the retrieved specific subset and ultimately obtained 285 LRGs.

Consensus clustering of PLRGs

The “limma” R software package was used for survival analysis, and 36 genes were identified as prognostic lactate-related genes (PLRGs) of LUAD, with P < 0.01. Cluster analysis was performed with the ‘ConsensusClusterPlus’ R package using the Euclidean distance metric and the partitioning around medoids (PAM) clustering algorithm, with 1000 resampling iterations to ensure the reliability of the classification results. Among k values from 2 to 9, k = 3 gave the best clustering stability and was selected.
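The resampling idea behind consensus clustering can be illustrated with a minimal Python sketch. This is not the ConsensusClusterPlus implementation used in the study: the function names, the 80% subsampling fraction, the random medoid initialisation and the toy data are all assumptions made for illustration, and the `pam` function is a simplified swap-based k-medoids routine rather than the full PAM algorithm.

```python
import random
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def total_cost(points, medoids):
    """Sum of distances from every point to its nearest medoid."""
    return sum(min(euclidean(p, points[m]) for m in medoids) for p in points)


def pam(points, k, rng):
    """Simplified k-medoids: random start, then greedy medoid swaps."""
    medoids = rng.sample(range(len(points)), k)
    cost = total_cost(points, medoids)
    improved = True
    while improved:
        improved = False
        for pos in range(k):
            for cand in range(len(points)):
                if cand in medoids:
                    continue
                trial = medoids[:pos] + [cand] + medoids[pos + 1:]
                trial_cost = total_cost(points, trial)
                if trial_cost < cost - 1e-12:
                    medoids, cost, improved = trial, trial_cost, True
    # Label each point with the index of its nearest medoid.
    return [min(range(k), key=lambda c: euclidean(p, points[medoids[c]]))
            for p in points]


def consensus_matrix(points, k, n_resamples=100, frac=0.8, seed=0):
    """Fraction of co-sampled runs in which each pair shares a cluster."""
    rng = random.Random(seed)
    n = len(points)
    together = [[0] * n for _ in range(n)]
    sampled = [[0] * n for _ in range(n)]
    size = max(k, int(frac * n))  # assumed subsampling fraction
    for _ in range(n_resamples):
        idx = rng.sample(range(n), size)
        labels = pam([points[i] for i in idx], k, rng)
        for a, b in combinations(range(size), 2):
            i, j = idx[a], idx[b]
            sampled[i][j] += 1
            sampled[j][i] += 1
            if labels[a] == labels[b]:
                together[i][j] += 1
                together[j][i] += 1
    return [[1.0 if i == j
             else (together[i][j] / sampled[i][j] if sampled[i][j] else 0.0)
             for j in range(n)] for i in range(n)]


# Hypothetical toy data: two well-separated groups of three samples each.
toy = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
cm = consensus_matrix(toy, k=2, n_resamples=100, seed=1)
```

Consensus values close to 1 or 0 for most sample pairs indicate a stable partition. Comparing how sharply the values separate across candidate k values is one way to judge which k is most stable.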

Application of GSVA and ssGSEA

GSVA was used to analyse the differences in gene set enrichment among the three lactate-related clusters. The gene set “c2.cp.kegg.v7.4.symbols.gmt” from the MSigDB database (https://www.gsea-msigdb.org) was obtained to conduct GSEA. As an extension of the GSEA method, ssGSEA was applied to calculate the degree of infiltration of 23 kinds of immune cells in each sample with the TIMER, CIBERSORT, CIBERSORT–ABS, QUANTISEQ, MCPCOUNTER, XCELL and EPIC methods. Using the ‘GSEABase’ and ‘GSVA’ R packages, differential expression-related pathways and immune cells were identified17.

DEGs

DEGs among LUAD patients in the three LA modification patterns were identified with |log2 FC| > 1 and adjusted P < 0.0001 (Bonferroni method)18,19. GO annotation and KEGG pathway enrichment analyses were performed on the identified genes with the ‘clusterProfiler’ R package20,21,22.
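As an illustration of the two filters applied here, the following Python sketch applies a Bonferroni correction and a fold-change cut-off to a toy result table. The gene names and values are hypothetical, the function names are not taken from the study's R workflow, and the fold-change cut-off is read as an absolute log2 fold change greater than 1.

```python
def bonferroni_adjust(p_values):
    """Bonferroni correction: multiply each P value by the number of tests, capped at 1."""
    n_tests = len(p_values)
    return [min(p * n_tests, 1.0) for p in p_values]


def select_degs(genes, log2_fc, p_values, fc_cutoff=1.0, p_cutoff=1e-4):
    """Keep genes with |log2 FC| above fc_cutoff and adjusted P below p_cutoff."""
    adjusted = bonferroni_adjust(p_values)
    return [gene for gene, fc, p_adj in zip(genes, log2_fc, adjusted)
            if abs(fc) > fc_cutoff and p_adj < p_cutoff]


# Hypothetical toy results for four genes.
genes = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
log2_fc = [2.3, -1.8, 0.4, 1.5]
p_values = [1e-8, 2e-6, 1e-9, 1e-3]

degs = select_degs(genes, log2_fc, p_values)
print(degs)  # ['GENE_A', 'GENE_B']
```

GENE_C fails the fold-change filter despite its small P value, and GENE_D fails the adjusted P filter despite its large fold change, so both criteria must hold for a gene to be retained.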

Construction and validation of the lactate-related signature

Univariate Cox analysis was performed to screen PLRGs. LASSO was used to construct a sixteen-gene prognostic signature based on the screened LRGs. The risk score was determined with the formula: risk score = ∑(expression level of each gene × its coefficient). To verify the 16-LRG risk signature, an external dataset, GSE50081, was downloaded.
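The risk score formula is a weighted sum of gene expression values. A minimal Python sketch is shown below; the gene names, coefficients and expression values are hypothetical placeholders, not the values in Table 2.

```python
def risk_score(expression, coefficients):
    """Sum of (expression level x coefficient) over the signature genes."""
    return sum(expression[gene] * coef for gene, coef in coefficients.items())


# Hypothetical coefficients: positive for a risk gene, negative for a protective gene.
coefficients = {"GENE_A": 0.5, "GENE_B": -0.25}

# Hypothetical expression profile of one sample; genes outside the signature are ignored.
sample = {"GENE_A": 4.0, "GENE_B": 2.0, "GENE_C": 7.0}

score = risk_score(sample, coefficients)
print(score)  # 1.5
```

Because only the signature genes enter the sum, the same coefficients can be applied unchanged to an external cohort, provided the expression values are on a comparable scale.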

Statistical analysis

R 4.2.0 was used for data processing.

Results

Screening and genetic variation profile of the prognostic lactate-related genes in LUAD

First, a total of 285 genes were downloaded from the MSigDB, and 36 genes were identified as prognostic lactate-related genes (PLRGs) of LUAD, with P < 0.01 (Table 1). Prevalent CNV changes were detected in the PLRGs. Among them, SLC25A4, SLC13A5 and C1QBP showed more frequent CNV deletions, whereas NDUFS6, COX6B1 and MRPL12 showed more frequent CNV amplifications (Fig. 1A). The somatic mutations and copy numbers of the 36 lactate-related genes in LUAD were analysed. Only 75 (13.37%) of the 561 samples had lactate-related gene mutations, and the overall mutation frequency was low (Fig. 1B). The chromosomal locations of the CNV alterations in the PLRGs were determined (Fig. 1C). Finally, we analysed the differences in the mRNA expression of selected genes between tumour and normal samples (Fig. 1D).

Table 1 Thirty-six prognostic lactate-related genes (PLRGs) of LUAD.
Fig. 1

Genetic variation of PLRGs. (A) The CNV variation frequency of PLRGs. (B) Mutation frequency of PLRGs. (C) Location of the CNV alteration of PLRGs on chromosomes. (D) Boxplot of PLRGs expression levels.

Three phenotypes were identified by prognostic lactate-related genes

The correlations between PLRGs are shown in a network diagram (Fig. 2A). For further analysis, we applied a consensus clustering algorithm to classify LUAD samples into subgroups on the basis of the 36 PLRGs. Among k values from 2 to 9, k = 3 showed the best stability; thus, LUAD patients were classified into three different lactate-related molecular clusters (LA clusters) (Fig. 2B–D). Principal component analysis (PCA) revealed three different distribution characteristics, which supported the consensus clustering results (Fig. 2E). Among LAcluster A (n = 180), LAcluster B (n = 291), and LAcluster C (n = 230), LUAD cases in LAcluster B had the best prognosis, and LUAD cases in LAcluster C had the worst prognosis (Fig. 2F). Different clinicopathological characteristics of individual PLRGs are shown in a heatmap (Fig. 2G).

Fig. 2

The construction of LAclusters. (A) Correlations and prognosis of the 36 PLRGs in patients with LUAD. (B) Unsupervised clustering of the 36 PLRGs and optimal consensus matrices for k = 3. (C) Consensus clustering CDF for k = 2–9. (D) The CDF curve of consensus clustering for k = 2–9. (E) PCA of three LAclusters. (F) The overall survival of three LAclusters by KM curves. (G) Heatmaps of the distribution of 36 PLRGs in the three LAclusters. (H) GSVA enrichment analysis in the three LAclusters. (I) Degree of immune cell infiltration of three LAclusters.

Next, we performed GSVA enrichment analysis. Compared with LAcluster A, LAcluster B was more enriched in immune activation pathways, including the JAK–STAT and Fc epsilon RI signalling pathways23,24. Moreover, the enriched pathways in LAcluster A were closely related to cellular mechanisms, and LAcluster C was predominantly associated with tumour metabolism, such as the p53 signalling pathway and purine metabolism (Fig. 2H). Finally, using ssGSEA, the enrichment scores of 23 kinds of immune cells in the 3 LAclusters were calculated. As shown in Fig. 2I, 13 kinds of immune cells had increased degrees of immune infiltration in LAcluster B, which further supported the association of LAcluster B with immune activation.

To further investigate the latent biological function of the lactate-related subtypes, we analysed the differences in LA-related gene expression among the three subtypes and identified 1208 overlapping LA phenotype-related DEGs (Supplementary Fig. 1A). Supplementary Table 1 shows the list of DEGs. GO and KEGG analyses were subsequently performed, and the significantly enriched biological processes are shown in Supplementary Fig. 1B–E.

Establishment of a sixteen-gene prognostic signature

Cox regression analysis revealed 36 prognosis-related genes (Fig. 3A). Next, LASSO Cox regression was used to identify key genes, and the coefficient of each gene was computed (Fig. 3B, C). Sixteen genes, including six protective genes (SLC25A42, HAGH, LDHD, MLYCD, ISCU and ACAT1) and ten risk genes (LDHA, NDUFA12, SLC3A1, ETHE1, MRPL44, PYGL, MDH2, DARS2, HPDL and NDUFA8), were selected to construct the prognostic signature. The coefficients of the sixteen genes are listed in Table 2. The risk score of each sample was determined by the formula: risk score = ∑(expression level of each gene × its coefficient). The samples were subsequently divided into a high-risk group (HRG) and a low-risk group (LRG) according to the median risk score. The distribution of risk scores and sample survival status are shown in Fig. 3D. The heatmaps show the expression of the 16 genes in the two risk groups (Fig. 3E). In addition, survival analysis revealed that, compared with LRG samples, HRG samples from LUAD patients were associated with poorer prognosis and worse survival (Fig. 3F). Univariate (Fig. 3G) and multivariate (Fig. 3H) Cox regression analyses revealed that the 16-LRG risk signature was related to the overall survival of LUAD patients and could predict their prognosis.
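The median-based split into risk groups can be sketched in a few lines of Python. The sample identifiers and scores below are hypothetical, and assigning samples that fall exactly on the median to the low-risk group is an assumption, since the text does not state how ties were handled.

```python
from statistics import median


def stratify_by_median(scores):
    """Label each sample 'high' if its risk score exceeds the cohort median, else 'low'."""
    cutoff = median(scores.values())
    return {sample: ("high" if value > cutoff else "low")
            for sample, value in scores.items()}


# Hypothetical risk scores for four samples.
scores = {"S1": 0.2, "S2": 1.4, "S3": 0.9, "S4": 2.1}

groups = stratify_by_median(scores)
print(groups)  # {'S1': 'low', 'S2': 'high', 'S3': 'low', 'S4': 'high'}
```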

Fig. 3

Establishment of the 16-LRG risk signature and analysis of independent prognostic factors. (A) Forest plots showing 36 prognostic LRGs based on Cox univariate regression analysis. (B, C) Cvfit and lambda curves of LASSO regression by minimum criteria. (D) Distribution of the risk scores and survival status in high- and low-risk groups. (E) Distribution of the 16 genes in high- and low-risk groups. (F) The overall survival of two risk groups by KM curves. Univariate Cox regression analysis (G) and multivariate Cox regression analysis (H) of clinical features regarding the 16-LRG risk signature.

Table 2 The coefficients of the 16 key genes determined by LASSO Cox regression.

Assessment of the correlation between the prognostic signature and clinical features

A ROC curve was used to evaluate the predictive value of the prognostic model compared with that of common cancer pathological features. The AUC of the 16-LRG risk signature was 0.692 (Fig. 4A). The areas under the curve were 0.692, 0.684 and 0.697 for the 1-year, 3-year and 5-year curves, respectively (Fig. 4B). A nomogram was constructed to assess the intrinsic value of the 16-LRG risk signature (Fig. 4C). Calibration plots for 1-year, 3-year and 5-year OS were constructed to visualize the accuracy of the nomogram (Fig. 4D). We constructed a heatmap of the different clinicopathological characteristics associated with individual genes (Fig. 4E). The attribute changes in individual LUAD samples were visualized by an alluvial diagram (Fig. 4F).
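For readers less familiar with the AUC, it equals the probability that a randomly chosen patient with the event has a higher risk score than a randomly chosen patient without it. The Python sketch below computes it from that pairwise definition for a simple binary outcome. It does not reproduce the time-dependent ROC analysis reported here, which must also account for censored follow-up, and the scores and labels are hypothetical.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (event, non-event) pairs ranked correctly; ties count 0.5."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both outcome classes are required")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical risk scores and binary outcomes (1 = event, 0 = no event).
toy_scores = [0.9, 0.4, 0.6, 0.3]
toy_labels = [1, 1, 0, 0]

auc = roc_auc(toy_scores, toy_labels)
print(auc)  # 0.75
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which gives a scale for reading the values reported above.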

Fig. 4

Correlation between clinical information and the 16-LRG risk signature. (A) ROC curves of clinical features and risk scores. (B) ROC curves of the 16-LRG risk signature forecasting 1-, 2-, and 3-year OS. (C) Nomogram of the 16-LRG risk signature, age, gender, and TNM stage. (D) Calibration curves for internal validation of the nomogram according to 1-, 3-, and 5-year OS rate of LUAD samples. (E) The distribution of different clinicopathological characteristics in the 16 LRGs. (F) The alluvial diagram of the transformations of LAcluster, 16-LRG risk signature (riskscore), and survival status (Fustat).

We further analysed the OS of patients with different clinicopathological features in the high- and low-risk score groups. The samples were stratified by patient age into an older group (> 65) and a younger group (≤ 65), and the high-risk score group had worse survival than the low-risk score group did regardless of age (Supplementary Fig. 2A, B). As shown in Supplementary Fig. 2C and D, the outcomes of both sexes could be predicted with the 16-LRG risk signature. LUAD patients in the high-risk score group had worse OS in all stages (N0, N1–3, T1–T2 and T3–T4) (Supplementary Fig. 2E–H). To summarize, stratified analysis revealed that the 16-LRG risk signature had prognostic value regardless of age, sex, T stage and N stage.

GSEA, GO and KEGG analysis

The GSEA results revealed that tumours in the HRG were more enriched in tumour-promoting pathways and cellular mechanisms, such as the p53 signalling pathway, purine metabolism, the cell cycle, homologous recombination and DNA replication. In contrast, metabolic pathways such as taurine and hypotaurine metabolism and primary bile acid biosynthesis were more enriched in the LRG (Fig. 5A). DEGs were defined as those with |log2 FC| > 1 and FDR < 0.05 (Fig. 5B). GO and KEGG enrichment analyses were then performed on the DEGs (Fig. 5C–F). Overall, the results revealed that the 16-LRG risk signature was associated with tumour and biological metabolism.

Fig. 5

GSEA, GO and KEGG analysis. (A) GSEA analysis of the high- and low-risk groups. (B) Volcano plot of DEGs between high- and low-risk groups. (C, D) GO analysis of DEGs. (E, F) KEGG analysis showing significantly enriched pathways.

Immune microenvironment analysis of 16-LRG risk signature

Firstly, using different methods, we explored the immune microenvironment of two risk groups (Fig. 6A). The link between the prognostic signature and TME-infiltrating immune cells was analysed (Fig. 6B). Among the 23 kinds of common immune cells, 14 differed significantly between the HRG and LRG (Fig. 6C). Interestingly, immune cells with tumour suppressive effects, such as activated B cells, eosinophils, T-follicular helper cells and plasmacytoid dendritic cells, were more highly infiltrated in tumours in the LRG25,26,27. Through immune function analysis, APC coinhibition, CCR, cytolytic activity, MHC class-I, T-cell coinhibition and parainflammation were shown to be activated in the HRG. HLA and type-II IFN responses were more strongly activated in the LRG (Fig. 6D). To determine whether patients in the HRG had better immunotherapy responses, the differences in immune target expression between the two risk groups were analysed (Fig. 6E). Effective immunotherapeutic targets such as PDCD1 (PD-1), CD274 (PD-L1) and IDO1 may provide better therapeutic effects for HRG patients.

Fig. 6

Immune-related analysis of high-risk group and low-risk group. (A) Heatmap of immune microenvironment in high- and low-risk group. (B) Connection of TME-infiltrating cells and the 16-LRG risk signature. (C) Boxplot of immune cells infiltration. (D) Immune function analysis. (E) Immune checkpoints analyses between LRG and HRG.

Tumour mutation burden of the 16-LRG risk signature

Because the TMB is associated with the efficacy of immunotherapy, cancer-related gene mutation data were utilized to assess the TMB levels in the two risk groups. First, the top 15 genes with the highest mutation frequency are shown as waterfall plots in Fig. 7A and B. TMB values were log10-transformed for plotting, and TMB scores were calculated for both groups. Compared with the LRG, the HRG had higher TMB (Fig. 7C). High TMB indicates a poor prognosis in various cancers. In our study, the OS of the high-risk group was poor, and the results were consistent with those of other studies. As shown in Fig. 7D, low TMB is associated with a worse prognosis in patients with LUAD. Interestingly, regardless of TMB level, low-risk patients had better OS than high-risk patients did (Fig. 7E). In summary, compared with the TMB, the 16-LRG risk signature was more reliable for predicting the prognosis of LUAD.

Fig. 7

Tumour mutation-related analyses. (A, B) Waterfall plots indicating TMB genes in HRG (A) and LRG (B). (C) TMB between the two risk groups. (D) Overall survival of high-TMB and low-TMB groups. (E) Overall survival of four groups.

Verification of the 16-LRG risk signature with an independent LUAD dataset

The GEO dataset GSE50081 was downloaded for independent external validation. The risk scores of the GSE50081 patients were calculated by the same formula. The GSE50081 patients were separated into the HRG and the LRG according to the median risk score. The distributions of patient risk scores and survival status are shown in Fig. 8A. As expected, the K‒M survival curve revealed that patients in the HRG had worse overall survival (Fig. 8B). The expression of the 16 LRGs in GSE50081 patients is shown as a heatmap (Fig. 8C). As shown by the ROC curve, the AUC of this risk score model was 0.721, which was greater than that of the other common clinical features (Fig. 8D). Time-dependent ROC curves were constructed, and the AUCs at 1, 2, and 3 years were 0.721, 0.623, and 0.657, respectively (Fig. 8E). Finally, we conducted expression and survival analyses on the genes used to establish the model (Supplementary Fig. 3A, B). The results confirmed the reliability of these genes.
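The K‒M curves used to compare the two risk groups rest on the product-limit estimator, which can be sketched in Python as follows. The follow-up times and event indicators are hypothetical, and the function is an illustration of the estimator rather than the survival code used in the study.

```python
def kaplan_meier(times, events):
    """Product-limit estimate: list of (time, survival probability) at each event time.

    times  -- follow-up time for each patient
    events -- 1 if the event (death) was observed, 0 if the patient was censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients who share this follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        # Both deaths and censored patients leave the risk set after time t.
        n_at_risk -= removed
    return curve


# Hypothetical follow-up (in years): the second patient is censored at year 2.
km_curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
print(km_curve)  # [(1, 0.75), (3, 0.375), (4, 0.0)]
```

Running the estimator separately on the HRG and the LRG yields the two step curves that a log-rank test would then compare.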

Fig. 8

Verification of the 16-LRG risk signature in an independent LUAD dataset. (A) The risk-score distribution and survival status. (B) Survival curves of two risk groups. (C) Expression of the 16 genes in two risk groups. (D) ROC curves of clinical features and risk scores. (E) ROC curves of the 16-LRG risk signature.

Discussion

LUAD, a highly heterogeneous tumour with a low overall survival rate, is the primary subtype of NSCLC28. Although some progress has been made in chemotherapy and immunotherapy for LUAD, the overall treatment results are still unsatisfactory. Owing to the complexity of the TME and the strong heterogeneity of LUAD, many patients exhibit drug resistance and insensitivity to treatment29. Therefore, a new LUAD classification that supports better patient treatment strategies is urgently needed. In recent years, increasing evidence has confirmed the role of lactic acid in promoting processes such as migration, immune escape and proliferation in cancer30. High lactate levels are associated with increased metastasis, tumour recurrence and poor outcomes11. These findings suggest that lactic acid can become a tumour treatment target. The role of lactic acid in lung cancer is receiving increasing attention. Glucose uptake and increased glycolytic activity are metabolic characteristics of lung cancer cells31. Previous studies have shown that human lung tumours may use lactate as a carbon source32. In terms of lung cancer immunity, lactate from tumours inhibits immune cell proliferation and effector function by acidifying the tumour microenvironment and upregulating PD-L1 expression10,33. According to the recent literature, increased extracellular lactate levels inhibit cytotoxic T-cell activity through the Notch1/TAZ axis, leading to the invasive characteristics of lung cancer cells34. Therefore, research on the mechanism through which lactate affects immunotherapy efficacy in LUAD is important. However, the comprehensive role of lactate-related genes in LUAD is not well defined. Therefore, in this study, we explored risk signatures for predicting prognosis in LUAD patients and the immune status of their TME. Ultimately, a 16-LRG risk signature was constructed to predict the prognosis and immunotherapy response of patients with LUAD.

In this study, on the basis of the expression of the 36 PLRGs, LUAD patients were divided into three subtypes. Using GSVA enrichment analysis, we further analysed the reasons for the differences among the three subtypes. The findings showed that LAcluster B was enriched in pathways associated with immune activation, that LAcluster A was enriched in cellular mechanisms, and that LAcluster C was predominantly related to tumour mechanisms. Finally, the correlation between the three lactate-related subtypes and TME cell infiltration was assessed.

Next, we identified 16 LRGs among the 36 PLRGs to construct a risk signature by LASSO and Cox regression analyses. LDHA catalyses the conversion of pyruvate to lactate, fuelling histone lactylation, which promotes gene expression linked to cancer progression and immune evasion35. NDUFA12 is a component of mitochondrial complex I, and its dysfunction can increase lactate production, indirectly influencing the lactylation landscape that supports tumour growth36. SLC3A1 is part of a cysteine/glutamate transporter, influencing redox balance and potentially affecting the metabolic shift to lactate generation in cancer cells37. ETHE1 mutations cause sulfide toxicity that inhibits mitochondrial respiration, potentially diverting metabolism towards aerobic glycolysis and lactate production in tumours38. MRPL44 is a mitochondrial ribosomal protein that is essential for oxidative phosphorylation; its deficiency can promote a glycolytic switch and increase the availability of lactate for lactylation39. PYGL encodes glycogen phosphorylase, which breaks down glycogen to provide glucose for glycolysis, thereby supplying the pyruvate and lactate necessary for lactylation in cancer40. HAGH degrades toxic glycolytic byproducts such as methylglyoxal, protecting cancer cells from damage and ensuring the continued glycolytic flux that generates a lactylation substrate41. LDHD oxidizes lactate back to pyruvate, potentially regulating intracellular lactate levels and thereby modulating the pool available for histone lactylation in cancer cells42. MDH2 is a key TCA cycle enzyme that maintains metabolic homeostasis; its dysregulation can disrupt mitochondrial metabolism and promote a lactogenic environment in tumours43. MLYCD regulates fatty acid metabolism, and its loss can lead to metabolic rewiring, which enhances the dependence on glycolysis and lactate generation in cancer44. ISCU is essential for iron‒sulfur cluster biogenesis; ISCU deficiency cripples mitochondrial respiration, forcing cancer cells to rely on glycolysis and lactate production45. ACAT1 catalyses a key step in ketone body and pyruvate metabolism, and its activity can influence the acetyl-CoA/lactyl-CoA balance, potentially by competing with or influencing lactylation46. NDUFA8 is another core subunit of mitochondrial complex I, and its loss impairs oxidative phosphorylation, driving the glycolytic shift that increases lactate levels and lactylation in cancer47.

The 16-LRG risk signature classified patients into the LRG and the HRG on the basis of the median risk score, and patients in the HRG had worse survival than those in the LRG did. Moreover, univariate and multivariate Cox regression analyses suggested that the 16-LRG risk signature was an independent prognostic marker for LUAD. ROC curves and nomograms have been widely applied to predict the prognosis of patients with cancer48,49. ROC analysis was used to visualize time-associated results in patients with LUAD. Moreover, a nomogram was constructed by combining the risk score with clinical factors.

Different immune cells play different roles in the tumour microenvironment. For example, the interaction and activation between B cells and macrophages are considered key humoral immune effects in inhibiting tumour progression50. The extensive infiltration of eosinophils in tumours has been shown to be related to the long-term survival of patients25. Neutrophils were found to have key functions in carcinogenesis51,52. Studies have confirmed that neutrophils antagonize metastasis in lung cancer53. The infiltration of T-follicular helper cells indicates that these cells are involved in the antitumour immunity of NSCLC26. In this study, more activated B cells, eosinophils, immature dendritic cells, mast cells, monocytes, natural killer cells, plasmacytoid dendritic cells, and T follicular helper cells were present in the TME in the LRG than in the HRG, and most of these cells had positive effects on tumour immunity, which may explain why LRG patients have better OS. These findings may help us to understand the relationships among lactate levels, TME cell infiltration and LUAD. Many studies have confirmed the link between lactate levels and immunosuppression. Lactate helps to establish an immunosuppressive environment conducive to cancer cell growth and immune escape54.

This study has several limitations. We did not conduct sufficient in-depth research on the 16 lactate-related genes (LRGs) selected, and their detailed mechanisms of action still need further exploration. Future research should focus on elucidating the detailed mechanisms of these genes in tumour progression and immune response and further verifying their effectiveness and mechanisms as tumour biomarkers and therapeutic targets through more in vitro and in vivo experiments. Moreover, this prognostic model requires more clinical patient information to verify its feasibility for clinical application.

Conclusion

Three lactate-related molecular subtypes with different prognoses were found in LUAD. Sixteen LRGs were identified, and relevant models were constructed. Both the ROC curve and the nomogram confirmed the prognostic value of the 16-LRG risk signature, which can serve as an independent prognostic factor for LUAD in patients. Considering the heterogeneity of LUAD, the 16-LRG risk signature enhances our characterization of TME cell infiltration and suggests more effective immunotherapy strategies.