Introduction

Non-small cell lung cancer (NSCLC) is a leading cause of cancer-related mortality globally, linked to multiple factors such as smoking, environmental pollution, and genetic mutations1. The high heterogeneity of NSCLC contributes to its aggressive behavior, leading to a persistently low five-year survival rate among advanced-stage patients2. Despite advances in treatment, including surgery, chemotherapy, radiotherapy, targeted therapy, and immunotherapy (e.g., PD-1/PD-L1 inhibitors), the overall prognosis for NSCLC patients remains poor3,4,5. In advanced stages, immunotherapy outcomes vary widely, with some patients showing no response or rapidly developing resistance6. Several factors make NSCLC difficult to treat. Among them, metabolic reprogramming of the tumor microenvironment, particularly the abnormal activation of glycolysis, impairs immune recognition and clearance of tumors and contributes significantly to immune evasion7,8,9.

Glycolysis is a central feature of metabolic reprogramming in tumor cells. Even in the presence of oxygen, tumor cells preferentially rely on glycolysis for energy production, a phenomenon known as the “Warburg effect”7. In NSCLC, increased glycolysis not only facilitates tumor cell migration and metastasis but also alters the tumor microenvironment by producing high levels of lactate, which supports tumor survival10. These metabolic byproducts suppress the antitumor activities of CD8+ T cells and natural killer (NK) cells while promoting the recruitment of immunosuppressive cells such as regulatory T cells (Tregs) and myeloid-derived suppressor cells (MDSCs), resulting in immune evasion11,12. CD8+ T cells are vital in antitumor immunity, directly killing tumor cells through perforin and granzymes. However, studies indicate that intratumoral CD8+ T cells are highly heterogeneous, with significant functional differences among subpopulations13,14; some may lose effector function or shift to an inhibitory phenotype within the tumor microenvironment. This heterogeneity results in diverse roles of CD8+ T cells in various tumor microenvironments, ranging from tumor suppression to ineffectiveness or exploitation by the tumor. While the critical role of CD8+ T cells in antitumor immunity is well established, the interplay among their subpopulations and the effects of their metabolic reprogramming on antitumor functions remain underexplored. These gaps not only limit our understanding of CD8+ T cell roles but also highlight important areas for developing novel therapeutic strategies.

We collected nine single-cell datasets and constructed a CD8 T-cell single-cell atlas from 89 patients, revealing a highly diverse transcriptional landscape of CD8 T cells encompassing 10 distinct states. Effector CD8 T cells expressing GZMK were predominant, alongside exhausted T cells with inhibitory checkpoint expression and stressed T cells expressing heat shock proteins. Notably, we identified a subset of LTB+ LDHA+ CD8 T cells with distinct immune phenotypes and metabolic profiles that promote tumor metastasis and impair CD8 T-cell cytotoxicity. Using multimodal data analysis (bulk transcriptomics, single-cell transcriptomics, spatial transcriptomics, and proteomics), we explored the potential mechanisms involving LDHA. These findings indicate promising avenues for targeting glycolysis and developing innovative immunotherapies for NSCLC within the realm of precision medicine.

Materials and methods

scRNA-seq cohorts

We collected nine datasets generated on the 10x Genomics platform (see Supplementary Table 1). The cohorts were analyzed using the “Seurat” R package. After cell annotation, CD8 T cell data were extracted from each cohort. Differentially expressed genes were identified with the “FindMarkers” function, applying criteria of log2FC > 0 and an adjusted p-value < 0.05. Differential gene analysis was performed between primary and metastatic samples, followed by cell communication analysis using the “CellChat” R package.
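The marker-selection thresholds can be illustrated with a minimal Python sketch. It is not the authors' code (the analysis itself was run in R with Seurat); it only applies the stated criteria to a table of results, reading the adjusted p-value criterion as "below 0.05". The `MarkerResult` record, the `filter_markers` helper, and the gene names are hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MarkerResult:
    """One row of a differential-expression table (hypothetical layout)."""
    gene: str
    log2fc: float
    p_adj: float


def filter_markers(results, min_log2fc=0.0, max_p_adj=0.05):
    """Keep genes with log2FC > min_log2fc and adjusted p < max_p_adj,
    sorted from the largest to the smallest fold change."""
    kept = [r for r in results if r.log2fc > min_log2fc and r.p_adj < max_p_adj]
    return sorted(kept, key=lambda r: r.log2fc, reverse=True)


# Placeholder values for illustration only; these are not study data.
example = [
    MarkerResult("GENE_A", 1.8, 1e-12),
    MarkerResult("GENE_B", 0.4, 0.20),   # fails the adjusted p-value cutoff
    MarkerResult("GENE_C", -0.9, 1e-5),  # fails the fold-change cutoff
    MarkerResult("GENE_D", 0.6, 0.003),
]
selected = [r.gene for r in filter_markers(example)]
print(selected)
```

Because the fold-change cutoff is zero, this filter keeps every significantly upregulated gene regardless of effect size; a stricter `min_log2fc` would shrink the list.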

RNA-seq cohorts

In addition to the scRNA-seq cohorts, transcriptomic data and clinical information were available for the RNA-seq cohorts TCGA_LUAD, GSE37745, GSE29013, and GSE117570. Gene identifiers were converted to gene symbols using the “clusterProfiler” R package, and data integration was conducted with the “sva” R package. Model training and validation were carried out with the “Mime” R package, using the differential genes identified earlier.

Spatial transcriptome analysis

Rigorous quality control measures were applied to the single-cell transcriptomics data, focusing on the number of expressed genes, unique molecular identifiers (UMIs), and the percentage of mitochondrial RNA in each cell. A signature score matrix was constructed by averaging the expression levels of the top 25 specific genes for each cell type from the scRNA-seq reference dataset. An enrichment score matrix was generated using the get_enrichment_matrix and enrichment_analysis functions from the Cottrazm package. Visualization was performed using the SpatialDimPlot function from the Seurat package. Spearman correlation analysis was used to assess correlations between cell content across all spots and between cell content and gene expression levels, with results visualized using the linkET package.

Exploration of related pathways

The CancerSEA database classified the functional states of tumor cells. The z-score algorithm proposed by Lee et al. was used to reflect pathway activity by integrating the expression of characteristic genes. The R package GSVA was used to calculate the combined z-scores for functional state gene sets, which were then standardized to obtain gene set scores. Pearson correlations between genes and each gene set score were calculated15. RBN RPPA data were median-centered and normalized by the standard deviation across all samples for each component to determine relative protein levels. Pathway scores were then calculated by summing the relative protein levels of positive regulatory components and subtracting those of negative regulatory components within each pathway16. The easier package utilized PROGENy to calculate scores for 12 pathways: EGFR, estrogen, hypoxia, JAK-STAT, MAPK, NF-κB, p53, PI3K, TGF-β, TNF-α, VEGF, and WNT. Pathway-specific characteristics were obtained from pathway-perturbation experiments that assessed changes in gene expression upon pathway disruption. A linear regression model was employed to fit genes affected by pathway perturbation. Pathway-specific characteristics and gene expression data were then used to infer pathway signaling activity. The easier package directly computed pathway fractions based on the “PROGENy” R package.
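The RPPA pathway score described above (median-centering, scaling by the standard deviation across samples, then summing positive regulators and subtracting negative regulators) can be sketched in a few lines of Python. This is an illustrative reading of the text rather than the authors' implementation; the function names, the use of the sample (n-1) standard deviation, and the protein labels `P1`, `P2`, and `N1` are assumptions.

```python
from statistics import median, stdev


def relative_levels(values):
    """Median-center one protein across samples, then divide by its
    standard deviation (sample SD is an assumption)."""
    center = median(values)
    spread = stdev(values)
    return [(v - center) / spread for v in values]


def pathway_scores(rppa, positive, negative):
    """rppa maps protein -> list of per-sample levels.
    Returns one score per sample: sum of positive regulators minus
    sum of negative regulators, using relative protein levels."""
    rel = {p: relative_levels(rppa[p]) for p in list(positive) + list(negative)}
    n_samples = len(next(iter(rel.values())))
    return [
        sum(rel[p][i] for p in positive) - sum(rel[p][i] for p in negative)
        for i in range(n_samples)
    ]


# Placeholder measurements for three samples; not study data.
rppa = {
    "P1": [1.0, 2.0, 3.0],
    "P2": [2.0, 4.0, 6.0],
    "N1": [3.0, 2.0, 1.0],
}
scores = pathway_scores(rppa, positive=["P1", "P2"], negative=["N1"])
print(scores)
```

Scaling each protein before summing keeps a single high-abundance component from dominating the pathway score.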

Functional enrichment analysis

KEGG17,18 enrichment analysis was carried out with the clusterProfiler package. Kaplan-Meier survival analysis was conducted with the survival package. Spearman correlation analysis was used to calculate the correlation between gene expression and the area under the dose-response curve (AUC) in the CTRP and PRISM databases.
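For readers unfamiliar with the expression-versus-AUC correlation, the following Python sketch computes a Spearman coefficient as the Pearson correlation of average ranks. It is a stand-in for the R routines used in the study, and the expression and AUC values are invented placeholders. A positive coefficient means that higher expression goes with a larger AUC, which is read in this paper as lower drug sensitivity.

```python
from math import sqrt


def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) for constant input."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def spearman(x, y):
    """Spearman rank correlation between two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson(average_ranks(x), average_ranks(y))


# Placeholder values: gene expression and drug-response AUC per cell line.
expression = [1.2, 3.4, 2.2, 5.0]
auc = [0.30, 0.55, 0.41, 0.90]
rho = spearman(expression, auc)
print(round(rho, 3))
```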

Basic experimental verification

CD8+ T lymphocytes were isolated from peripheral blood by negative selection using the Miltenyi Biotec CD8a+ T cell isolation kit. Following isolation, CD8+ T lymphocytes were activated with plate-bound αCD3 (4 µg/mL) and αCD28 (4 µg/mL) and cultured in RPMI-1640 supplemented with 10% heat-inactivated FBS and 1% penicillin-streptomycin. After 48 h, the medium was refreshed and 100 U/mL recombinant IL-2 was added. Total RNA was isolated using the Direct-zol RNA MiniPrep Kit (Genesee Scientific, 11–330). qPCR was performed using PerfeCTa® SYBR® Green FastMix (QuantaBio, 101414-270) on a Roche LightCycler 480 detection system according to the manufacturer’s instructions. The following primers were used:

IFN-γ (F: GAGTGTGGAGACCATCAAGGAAG, R: TGCTTTGCGTTGGACATTCAAGTC), PD-1 (F: AAGGCGCAGATCAAAGAGAGCC, R: CAACCACCAGGGTTTGGAACTG), GZMB (F: CGACAGTACCATTGAGTTGTGCG, R: TTCGTCCATAGGAGACAATGCCC), GNLY (F: GTACTACGACCTGGCAAGAGCC, R: TCAGACAGGTCCTGTAGTCACG).

Statistical analysis

All statistical analyses were performed using R software (version 4.1.0). The Wilcoxon test was used to analyze differences between two groups, while the Kruskal-Wallis test was applied to analyze differences among more than two groups. Kaplan-Meier survival curves were compared using the log-rank test, and correlation coefficients were calculated using Spearman correlations. P-values < 0.05 were considered statistically significant, with significance denoted as follows: * P < 0.05; ** P < 0.01; *** P < 0.001; **** P < 0.0001.
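The significance notation can be expressed as a small lookup, sketched here in Python. The thresholds follow the text; the function name and the "ns" label for non-significant values are assumptions added for illustration.

```python
def significance_stars(p_value):
    """Map a P-value to the star notation used in the figures.
    'ns' (not significant) is an assumed label for P >= 0.05."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("P-value must lie between 0 and 1")
    if p_value < 0.0001:
        return "****"
    if p_value < 0.001:
        return "***"
    if p_value < 0.01:
        return "**"
    if p_value < 0.05:
        return "*"
    return "ns"


for p in (0.00005, 0.0005, 0.005, 0.03, 0.2):
    print(p, significance_stars(p))
```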

Results

Overview of CD8 T cell subsets in NSCLC

To uncover the specific subpopulations and potential functions of CD8 + T cells in NSCLC, we analyzed nine datasets from the 10x Genomics platform and created a single-cell transcriptome atlas of CD8 + T cells from 89 patients. After rigorous quality control, the atlas comprised 351,571 CD8 + T cells from primary, metastatic, and normal tissue samples (Supplementary Table 1). T cells from normal tissues, primary tumors, and metastatic sites accounted for 14%, 81%, and 5%, respectively. To decode the transcriptional signals, we clustered the CD8 + T cells and identified ten stable clusters, each with unique signature genes (Fig. 1A). The GZMK + CD8 + Tem subset was the most abundant (Fig. 1B up) but was significantly reduced in both proportion and number in metastatic compared to primary sites (Fig. 1B down; Supplementary Fig. 1A). This subset primarily secretes effector molecules such as GZMK and CST7 (Supplementary Fig. 1B) and shows significant enrichment of MHC class I molecules (Fig. 1C) and TLR signaling (Fig. 1D), typically associated with effector T cells. The IL7R + CD8 + Tm subset was most prevalent in normal tissues but showed a significant decrease in both proportion and number in metastatic compared to primary sites. This subset mainly expresses TOB1 and IL7R and is enriched in cAMP and FoxO signaling pathways. The ID2 + CD8 + Tex subset was more prevalent in metastatic compared to primary sites, characterized by high expression of inhibitory checkpoints such as PD-1 and LAG3 and significant enrichment in the PD-L1/PD-1 signaling pathway, indicating exhausted CD8 + T cells. The LTB + CD8 + Tn subset was the least abundant (Fig. 1B up), with a significant increase in both proportion and number in metastatic compared to primary sites (Fig. 1B down; Supplementary Fig. 1A). However, this subset did not show significant expression or enrichment of inhibitory checkpoints and secretory factors (Fig. 1C and D). 
Cell-cell communication analysis revealed significant interactions between LTB + CD8 + Tn and GZMK + CD8 + Tem, ID2 + CD8 + Tex, and IL7R + CD8 + Tm. Notably, interactions between LTB + CD8 + Tn and IL7R + CD8 + Tm were most pronounced in metastatic (Fig. 1E) compared to primary sites (Fig. 1F). In summary, our CD8 + T cell atlas not only captured known CD8 + T cell subset characteristics but also revealed potential non-canonical CD8 + T cell subsets.

Fig. 1

Overview of CD8 T Cell Subsets in NSCLC. (A). UMAP plot showing clustering of CD8 T cell subsets based on scRNA-seq data. Colors represent distinct subsets identified through marker gene expression. (B). The number and proportion of CD8 T cells. (C). Expression profiles of differentially expressed cytokines, MHC, effector molecules, checkpoint receptors, and receptors. (D). Enriched pathways of each T cell subset. (E). Cell communication analysis of LTB+ CD8Tn in primary tumors. (F). Cell communication analysis of LTB+ CD8Tn in metastatic tumors. In (E) and (F), the width of the lines represents the strength of the interactions, while the size of the circles reflects the number of cells in each subset.

Identification of key genes in LTB+ CD8 T cells

To further uncover the characteristic genes of LTB+ CD8+ Tn cells, we analyzed differential genes between metastatic and primary sites within the LTB+ CD8+ T cell subset (Supplementary Fig. 2A) and compared these with differential genes between LTB+ CD8+ T cells and other subsets across all metastatic sites (Supplementary Fig. 2B). We identified 953 common genes (Supplementary Fig. 2C), which were notably enriched in the glycolysis signaling pathway (Supplementary Fig. 2D-2E). To refine the selection of subset-specific genes, we constructed 117 machine learning prognostic models using the R package Mime1. Among these, the StepCox[forward] + Lasso model showed strong prognostic prediction capabilities across multiple datasets, with AUC values of 0.7, 0.72, and 0.7 for 1-year, 2-year, and 3-year predictions, respectively, in the training sets (Fig. 2A). The ROC curves in Fig. 2B further illustrate the model’s predictive performance. Importantly, the StepCox[forward] + Lasso model demonstrated significant prognostic value in univariate COX regression meta-analyses across several datasets (Fig. 2C; Supplementary Fig. 1F). Four genes (TBCD, PTPRC, LDHA, and ACTR2) were consistently identified as significant for NSCLC across multiple models, including StepCox[forward] + Lasso (Fig. 2D and E). In summary, our analysis using 117 machine learning models identified these four characteristic genes as key drivers of NSCLC metastasis in LTB+ CD8+ Tn cells.

Fig. 2

Identification of Key Genes in LTB+ CD8 T Cells. (A). AUC values of the 117 machine learning models for predicting 1-, 3-, and 5-year prognosis. (B). ROC curves for predicting 1-, 3-, and 5-year outcomes using the StepCox[forward] + Lasso model. (C). The StepCox[forward] + Lasso model demonstrated significant prognostic value in univariate COX regression meta-analyses across several datasets. (D, E). Four genes (TBCD, PTPRC, LDHA, and ACTR2) were consistently identified as significant for NSCLC across multiple models.

Functional characterization of LTB+ CD8 Tn signature

To further elucidate the potential functions of LTB + CD8 + Tn cells, we used the R package GSVA to compute a signature based on the genes TBCD, PTPRC, LDHA, and ACTR2. Enrichment analyses using GO (Supplementary Fig. 3A) and KEGG (Supplementary Fig. 3B) with TCGA transcriptome data revealed significant enrichment in metastasis-related pathways, including cell-substrate junctions, focal adhesion, and the HIF-1 signaling pathway, as well as in glycolysis pathways. Additionally, we utilized Reverse Phase Protein Array (RPPA) data from the TCPA database to calculate pathway activity scores for ten cancer-related pathways. Our computed signature was found to be significantly positively correlated with EMT (Supplementary Fig. 3C). Importantly, in single-cell transcriptome data, the signature also showed a significant positive correlation with hypoxia and invasion (Fig. 3A). We further examined the most prevalent cell types in each microregion of spatial transcriptome sequencing based on deconvolution results. Using the SpatialDimPlot function from the Seurat package, we visualized the dominant cell types for each microregion (Fig. 3B). Spearman correlation analysis was then conducted to evaluate correlations between cell types across all spots, as well as between cell types and the signature or gene expression levels. Visualization was performed using the linkET package. The signature (Fig. 3C), along with LDHA (Fig. 3D) and ACTR2 (Fig. 3E), showed significant negative correlations with CD8 + T cells and positive correlations with tumor cells. In contrast, PTPRC was primarily positively correlated with various immune cells (Fig. 3F), while TBCD was negatively correlated with fibroblasts (Fig. 3G). In summary, our results indicate that the signature derived from LTB + CD8 + Tn metastasis-related genes is significantly positively correlated with invasion and metastasis, and significantly negatively correlated with CD8 + T cells.

Fig. 3

Functional Characterization of LTB + CD8 Tn Signature. (A). The signature and pathway activity module present the correlation between GSVA score and pathway activity, which is defined by pathway scores. (B). Using the SpatialDimPlot function in the Seurat software package, the main cell types in each micro region were visualized. (C-G). Spearman correlation analysis was then conducted to evaluate correlations between cell types across all spots, as well as between cell types and the signature or gene expression levels. Visualization was performed using the linkET package.

Prognostic implications of key genes

To further elucidate the metastatic-related characteristic genes of LTB + CD8 + Tn cells, we performed a detailed analysis of TBCD, PTPRC, LDHA, and ACTR2. We found that LDHA and TBCD were significantly upregulated in tumors, while PTPRC was significantly downregulated (Fig. 4A and B). Prognostic analysis revealed that higher expression of LDHA and ACTR2 was associated with poorer outcomes across overall survival (OS), disease-specific survival (DSS), disease-free interval (DFI), and progression-free interval (PFI). Conversely, higher expression of TBCD and PTPRC correlated with better outcomes in OS and DSS (Fig. 4C and F). Further investigation showed that LDHA had a significant positive correlation with clinical pathological stages, while TBCD and PTPRC exhibited a negative correlation with these stages (Fig. 4G and H). Importantly, PTPRC, LDHA, and ACTR2 were significantly positively correlated with activation of the EMT signaling pathway, whereas LDHA and ACTR2 showed significant negative correlations with inhibition of the PI3K/AKT signaling pathway (Fig. 4I). In addition, within the CD8 + T cell populations we analyzed, TBCD, PTPRC, LDHA, and ACTR2 were all significantly positively correlated with inhibitory checkpoints and expressed MHC class I molecules. LDHA, however, was negatively correlated with effector molecules such as GNLY, IL2, and TNF (Fig. 4J). Notably, pathways such as oxidative phosphorylation and NF-κB signaling were significantly enriched (Fig. 4K). To further investigate the role of LDHA in LTB + CD8 + Tn cells, we performed an enrichment analysis and found significant enrichment in the glycolysis signaling pathway, consistent with LDHA’s role as a glycolytic enzyme (Fig. 4L).

Fig. 4

Prognostic Implications of Key Genes. (A). Single-gene set enrichment analysis of the tumor group vs. the normal group, performed with the clusterProfiler package. (B). Box plots showing the expression levels of selected genes, grouped by normal and tumor tissues. Statistical significance of differences between groups was assessed using the Wilcoxon rank-sum test, with p-values indicating differential expression. (C-F). Prognostic analysis revealed that higher expression of LDHA and ACTR2 was associated with poorer outcomes across overall survival (OS), disease-specific survival (DSS), disease-free interval (DFI), and progression-free interval (PFI). (G). Bubble plot showing the mRNA expression levels of selected genes across different tumor stages. (H). Line plot showing the mRNA expression trends of selected genes across tumor stages, from early to late. Each line represents a gene, with changes in expression levels highlighting stage-specific regulatory patterns. (I). Expression and pathway activity module estimating the difference in gene expression between pathway activity groups (activation and inhibition), which are defined by median pathway scores. (J). Expression profiles of differentially expressed MHC, effector molecules, and checkpoint receptors based on scRNA-seq data. (K). Bubble plot displaying KEGG pathway enrichment analysis of all cells based on scRNA-seq data. The size of the bubbles indicates the number of genes involved in each pathway, while color intensity reflects the statistical significance of enrichment. (L). Bubble plot displaying KEGG pathway enrichment analysis of LTB+ CD8 Tn based on scRNA-seq data.

LDHA: a prognostic biomarker for poor survival in NSCLC

In NSCLC, LDHA was significantly upregulated in tumors compared to paired (Fig. 5A) and non-paired controls (Fig. 5B), demonstrating strong diagnostic capability (Supplementary Fig. 4A). At the protein level, LDHA also showed elevated expression in tumors, consistent with the transcriptomic data (Fig. 5C). Importantly, these findings were validated by immunohistochemistry (IHC) experiments (Supplementary Fig. 4B). Patients were categorized into four groups based on LDHA gene expression levels: Q1, Q2, Q3, and Q4. Q1 represents the top 25% of samples with the highest expression, while Q4 represents the bottom 25% with the lowest expression. Chi-square tests indicated that higher LDHA expression was associated with increased mortality, with statistically significant results (Supplementary Fig. 4C,D). LDHA demonstrated significant prognostic value in TCGA datasets for overall survival (OS), disease-specific survival (DSS), and progression-free interval (PFI) (Fig. 5D and F), with consistent results observed in GEO datasets (Fig. 5G and H; Table 1). NSCLC patients with high LDHA expression had poorer survival outcomes compared to those with low expression, and this correlation was consistent across all four classification groups (Supplementary Fig. 5A-5C). Additionally, in proteomic data, high LDHA expression was associated with worse survival (Fig. 5I). In summary, our results indicate that LDHA, a target related to LTB + CD8 + Tn cell metastasis, is a potential prognostic biomarker for NSCLC.
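The quartile grouping used above (Q1 for the top 25% of samples by expression, Q4 for the bottom 25%) can be sketched as follows. This Python snippet is an illustration only: the rank-based split, the handling of ties by sort order, and the sample identifiers and values are assumptions, not details reported in the study.

```python
def quartile_groups(expression):
    """expression maps sample id -> expression value.
    Returns sample id -> 'Q1'..'Q4', where Q1 holds the highest-expressing
    quarter of samples and Q4 the lowest (rank-based split, an assumption)."""
    ranked = sorted(expression, key=expression.get, reverse=True)
    n = len(ranked)
    groups = {}
    for position, sample in enumerate(ranked):
        quartile = position * 4 // n + 1  # positions 0..n-1 map to 1..4
        groups[sample] = f"Q{quartile}"
    return groups


# Placeholder expression values for eight hypothetical samples.
ldha = {f"S{i}": float(9 - i) for i in range(1, 9)}  # S1 highest, S8 lowest
groups = quartile_groups(ldha)
print(groups)
```

Grouping by rank rather than by fixed expression cutoffs keeps the four groups close to equal in size, which matters for the chi-square and survival comparisons that follow.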

Fig. 5

LDHA: A Prognostic Biomarker for Poor Survival in NSCLC. (A). Differences in gene expression between tumor/normal groups (paired differential analysis). (B). Differential gene expression between the cancer/normal groups. (C). Protein level LDHA expression difference (tumor/normal group). (D-F). Kaplan-Meier survival analysis for 3 indicators (OS, DSS, and PFI). (G, H). Kaplan-Meier survival analysis of GEO data. (I). Kaplan-Meier survival analysis of LDHA in proteomics.

Table 1 Independent prognostic analysis to show whether the gene’s association with tumor survival is independent of traditional clinical variables.

Phosphorylation of LDHA and its clinical relevance

We further performed univariate COX survival analysis for LDHA phosphorylation sites and identified the NP_001158886.1 site as a risk factor associated with poorer overall survival (Supplementary Fig. 6A). This site was also found to be expressed at higher levels in tumors (Supplementary Fig. 6B). We calculated the Spearman correlation and p-values between LDHA protein levels and pathway activity scores using the cor.test function. At the protein level, LDHA showed a significant positive correlation with the EMT pathway (Fig. 6A). To analyze tumor cell functional states, we employed the R package GSVA to compute gene set scores. At the RNA level, LDHA expression was significantly positively correlated with hypoxia and invasion (Fig. 6B; Supplementary Fig. 6C). Using the easier package and the “PROGENy” R package, which calculate scores for 12 cancer-related pathways, we found that high LDHA expression was associated with enrichment in hypoxia, PI3K, and VEGF signaling pathways (Fig. 6C; Supplementary Fig. 6D). In multiple datasets, increased LDHA expression correlated with advancing disease stage, suggesting a potential role of LDHA in NSCLC progression (Fig. 6D and F). Importantly, in single-cell transcriptome data, LDHA was significantly positively correlated with hypoxia and invasion (Fig. 6G). These findings suggest that LDHA, a target linked to LTB+ CD8+ Tn cell metastasis, could be a valuable biomarker for predicting NSCLC metastasis.

Fig. 6

Phosphorylation of LDHA and Its Clinical Relevance. (A). Correlation between LDHA expression and pathway-level quantification of functional proteins by TCPA-RPPA sequencing. (B). Pearson correlation between z-scores of LDHA expression and GSVA scores of tumor states. (C). Difference in pathway scores between the high and low LDHA expression groups. (D-F). LDHA expression at different stages. (G). The correlation between LDHA and pathway activity, which is defined by pathway scores.

High LDHA expression linked to reduced CD8+ T cell infiltration and immunotherapy resistance

We examined the relationship between LDHA expression and immune infiltration, as well as genomic states, by analyzing scores related to immunogenicity and DNA damage. We found that increased LDHA expression was associated with lower TCR and lymphocyte infiltration scores, while migration and proliferation scores were higher (Fig. 7A). Various algorithms were used to assess the correlation between CD8+ T cell content and LDHA expression (Fig. 7B). A consistent negative correlation was observed between LDHA and CD8+ T cell infiltration; as LDHA levels increased, CD8+ T cell infiltration decreased. In multiple datasets, LDHA was significantly negatively correlated with cytotoxic T lymphocytes (CTLs) (Fig. 7C and E). Notably, in the low LDHA expression group, higher CTL levels were associated with better prognosis. However, in the high LDHA expression group, higher CTL levels did not correlate with improved prognosis and even suggested a trend toward worse outcomes with increasing CTL levels. This indicates that LDHA may be related to T cell exhaustion (Fig. 7F-H). In a melanoma immunotherapy cohort, high LDHA expression was associated with poor response to immunotherapy (Fig. 7I). Similarly, in datasets from various cancers, including NSCLC, higher LDHA expression correlated with a higher probability of resistance to immunotherapy (Fig. 7J-M). Spearman correlation analysis of the PRISM and CTRP databases revealed that LDHA expression was positively correlated with the area under the dose-response curve (AUC) for the EGFR inhibitor afatinib, indicating that higher LDHA levels were associated with reduced sensitivity to afatinib (Supplementary Fig. 7A-7B).

Fig. 7

High LDHA Expression Linked to Reduced CD8 T Cell Infiltration and Immunotherapy Resistance. (A). From left to right, the heat map represents the intra-group mean of each immune response score and genome status score for subtypes Q1, Q2, Q3, and Q4. The heatmap is standardized by row so that each score is scaled to the same range. (B). Various immune cell infiltration algorithms reveal the relationship between LDHA and CD8 T cell infiltration. (C-E). Correlation between LDHA and CTLs in GSE9893, GSE31245, and Lauss2017_ACT. (F-H). The Kaplan-Meier curve illustrates the relationship between LDHA and T cell exhaustion in GSE17536, GSE51088, and Nutt_Glioma. (I). Relationship between LDHA and immunotherapy efficacy in the Nathanson 2017 dataset. (J-M). Chi-square test to evaluate the significance of the difference in immunotherapy response rates between high and low gene expression groups (phs000452, PRJNA482620, E-MTAB-6270, and GSE126044).
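The chi-square comparison of response rates between high and low expression groups can be illustrated with a 2 x 2 table. The Python sketch below computes the Pearson chi-square statistic without continuity correction and its P-value for one degree of freedom. It is not the authors' code, and the counts are invented placeholders. Note that R's chisq.test applies Yates' continuity correction to 2 x 2 tables by default, so its output would differ slightly from this uncorrected version.

```python
from math import erfc, sqrt


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for the table
        [[a, b],   e.g. high expression: responders, non-responders
         [c, d]]        low expression:  responders, non-responders
    Returns (statistic, p_value) with one degree of freedom."""
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column needs at least one observation")
    statistic = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom the chi-square survival function is erfc(sqrt(x / 2)).
    p_value = erfc(sqrt(statistic / 2))
    return statistic, p_value


# Placeholder counts, not study data: 10/30 responders in the high group,
# 20/30 responders in the low group.
statistic, p_value = chi_square_2x2(10, 20, 20, 10)
print(round(statistic, 3), round(p_value, 4))
```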

Poor prognosis in NSCLC associated with high LDHA and low IL7R or GNLY expression

Previous data indicated significant cell interactions between LTB + CD8Tn and IL7R + CD8Tm. We performed survival analysis based on LDHA and IL7R expression levels. The group with high LDHA and low IL7R expression showed the poorest prognosis (Fig. 8A, B, E and F). Additionally, our earlier data revealed that increased LDHA, a metastasis-related target for LTB + CD8Tn, was associated with a marked decrease in the secretion of the factor GNLY. We conducted survival analysis based on LDHA and GNLY expression and found that the group with high LDHA and low GNLY expression had a worse prognosis compared to the group with low LDHA and high GNLY expression (Fig. 8C, D, G and H). These results were consistently validated across multiple datasets.

Fig. 8

Poor Prognosis in NSCLC Associated with High LDHA and Low IL7R or GNLY Expression. (A,B). Molecular subtype for LDHA and IL7R in TCGA. (a) Z-score scatter plot of samples, where each point represents a sample and different colors indicate different subgroups. The x and y axes represent the Z-scores of two genes, with Z-score ≤ 0 indicating low expression and Z-score > 0 indicating high expression. (b) Kaplan-Meier survival analysis, with the x-axis representing survival time (t) and the y-axis showing the probability of survival beyond time t. The Log-rank test assesses differences between survival curves, with a p-value < 0.05 considered significant. Pairwise Log-rank test results are presented in a table with a gray background, and the overall Log-rank test result is shown in the bottom left corner. (C,D). Molecular subtype for LDHA and GNLY in TCGA. (E,F). Molecular subtype for LDHA and IL7R in GSE72094. (G,H). Molecular subtype for LDHA and GNLY in GSE72094.
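The two-gene subgrouping described in this legend (Z-score ≤ 0 as low expression, Z-score > 0 as high expression) can be sketched in Python as follows. This is an illustrative reading, not the authors' implementation: computing Z-scores across the samples of one cohort with the sample standard deviation is an assumption, and the expression values are placeholders.

```python
from statistics import mean, stdev


def z_scores(values):
    """Standardize one gene across samples (sample SD is an assumption)."""
    center, spread = mean(values), stdev(values)
    return [(v - center) / spread for v in values]


def two_gene_groups(gene_a, gene_b, name_a="LDHA", name_b="IL7R"):
    """Label each sample by the sign of its Z-score for two genes:
    Z > 0 is 'high', Z <= 0 is 'low', giving four possible subgroups."""
    labels = []
    for za, zb in zip(z_scores(gene_a), z_scores(gene_b)):
        level_a = "high" if za > 0 else "low"
        level_b = "high" if zb > 0 else "low"
        labels.append(f"{name_a}-{level_a}/{name_b}-{level_b}")
    return labels


# Placeholder expression values for four hypothetical samples.
ldha = [1.0, 2.0, 3.0, 4.0]
il7r = [4.0, 3.0, 2.0, 1.0]
labels = two_gene_groups(ldha, il7r)
print(labels)
```

The resulting labels can then serve as the grouping variable for the Kaplan-Meier curves and pairwise log-rank tests shown in the figure.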

Considering that IHC may not accurately distinguish LDHA expression in tumor cells from that in immune cells, we analyzed a single-cell dataset (Fig. 9A). Our findings revealed that LDHA is highly expressed in both tumor cells and CD8+ T cells, with the highest expression observed in CD8+ T cells in the GSE117570 dataset (Fig. 9B). To investigate further, we isolated CD8+ T cells from human peripheral blood, activated them with anti-CD3/CD28, and then treated them with lactate. qPCR analysis revealed that, compared to the control group, lactate treatment significantly downregulated CD8+ T cell activation markers such as GNLY, GZMB, and IFN-γ, while upregulating the exhaustion marker PD-1 (Fig. 9C). More importantly, flow cytometry analysis demonstrated a marked decrease in IFN-γ and a significant increase in PD-1 in CD8+ T cells treated with lactate (Fig. 9D). LDHA plays a key role in lactate production, and these experiments show that lactate inhibits the effector function of CD8+ T cells and induces T cell exhaustion.

Fig. 9

Validation through basic experiments. (A). UMAP plot showing clustering of cell subsets based on scRNA-seq data (GSE117570). (B). Expression profiles of LDHA. (C). RNA level changes were detected by qPCR after 24 h of lactate treatment (10 mM). (D). Changes in IFN-γ and PD-1 were detected by flow cytometry after 24 h of lactate treatment (10 mM).

Discussion

CD8+ T cells are pivotal in tumor immunity, and understanding their heterogeneity is crucial for grasping tumor progression and developing new therapeutic strategies19. However, most studies to date have focused on single subpopulations or specific targets, limiting the comprehensive understanding of CD8+ T cells. This study systematically examined the distribution and functional differences of various CD8+ T cell subsets in primary and metastatic non-small cell lung cancer (NSCLC) by analyzing single-cell transcriptomic data from 89 patients, particularly highlighting the differential expression patterns of GZMK+ CD8Tem and ID2+ CD8Tex subsets. We performed an in-depth analysis of the LTB+ CD8Tn subset, discovering a significant increase in these cells in metastatic sites and notable communication with other CD8+ T cell subsets, such as GZMK+ CD8Tem and IL7R+ CD8Tm. These findings challenge existing literature, which often considers LTB+ CD8 T cells as non-effector or functionally unclear19,20, and suggest that LTB+ CD8Tn may have a unique function in the tumor microenvironment.

Additionally, we utilized multiple machine learning models to identify LTB+ CD8Tn key genes linked to NSCLC metastasis. Out of the 117 prognostic models built using the R package Mime1, the StepCox[forward] + Lasso model exhibited strong predictive performance across multiple datasets, with AUC values of 0.7 or higher. This result was validated in the training set and further confirmed through univariate COX regression meta-analysis across various datasets, demonstrating significant prognostic value. Compared to studies based on single models21,22, our multi-model approach more comprehensively captured biologically significant feature genes, yielding more reliable predictions. From the multi-model intersection analysis, we identified four key genes (TBCD, PTPRC, LDHA, and ACTR2) that consistently showed high weights across several prognostic models, bolstering their reliability as potential biomarkers. Through multimodal data analysis, including bulk transcriptomics, single-cell transcriptomics, and spatial transcriptomics, we found that the expression of LTB+ CD8Tn-associated genes was significantly positively correlated with NSCLC invasion and metastasis, and negatively correlated with CD8+ T cell levels, underscoring these genes’ importance in the tumor microenvironment.

We further explored the expression and prognostic relevance of TBCD, PTPRC, LDHA, and ACTR2 in NSCLC. Previous studies have demonstrated that TBCD plays a critical role in microtubule dynamics and tumor cell migration23,24; PTPRC (CD45) is vital for immune regulation and is typically associated with immune responses25,26; ACTR2 is involved in cytoskeletal remodeling and tumor metastasis27,28; and LDHA, a key glycolytic enzyme, is closely linked to poor prognosis in various cancers29,30. Our findings indicated that LDHA and TBCD were significantly overexpressed in NSCLC tumor tissues. High LDHA expression was associated with worse overall survival (OS), disease-specific survival (DSS), disease-free interval (DFI), and progression-free interval (PFI), whereas high TBCD and PTPRC expression correlated with improved OS and DSS. LDHA was also positively associated with clinicopathological stage and inversely related to the PI3K/AKT signaling pathway, highlighting its crucial role in tumor metabolism. Moreover, LDHA was positively correlated with inhibitory checkpoints and MHC class I molecules, suggesting involvement in immune evasion mechanisms, and negatively correlated with effector molecules such as GNLY, IL2, and TNF, further supporting its complex role in tumor progression. Significant enrichment in oxidative phosphorylation and NF-κB signaling pathways also indicated LDHA’s key role in signal transduction and metabolic reprogramming.

As a key enzyme in glycolysis, LDHA is widely implicated in tumor cell growth and metastasis across various cancers. However, its role in CD8 + T cells has been underexplored. This study is the first to highlight the critical role of LDHA in CD8 + T cells, particularly in LTB + CD8Tn cells in NSCLC. We found that LDHA was significantly overexpressed in NSCLC tumor tissues, a result validated at the protein level and by immunohistochemistry. Additionally, high LDHA expression was significantly associated with poor survival outcomes in TCGA and GEO datasets, emphasizing its potential as a prognostic biomarker. LDHA has been reported to enhance tumor cell invasion and metastasis by increasing lactate production31. Further analysis in this study revealed that LDHA in LTB + CD8Tn cells activated hypoxia, PI3K, and VEGF signaling pathways through lactate production, significantly promoting NSCLC invasion and metastasis. These findings open new avenues for developing targeted therapies against LDHA.

High LDHA expression is closely tied to tumor progression and significantly suppresses CD8 + T cell function, particularly cytotoxic T lymphocyte (CTL) activity32. Lactate accumulation suppresses T cell function, enhancing tumor cell growth and metastasis33. In our study, LDHA was closely associated with immune infiltration and genomic status in NSCLC, particularly in LTB + CD8Tn cells. Higher LDHA expression was accompanied by lower TCR and lymphocyte infiltration scores and by higher migration and proliferation scores. Evaluations with multiple algorithms revealed a stable negative correlation between LDHA and CD8 + T cell infiltration, particularly for CTLs, with high LDHA expression linked to a significant reduction in CTLs. This suggests that LDHA may promote tumor immune evasion by suppressing CD8 + T cell function. Importantly, high LDHA expression was associated with T cell exhaustion. In the low LDHA expression group, high CTL levels were associated with good prognosis, whereas in the high LDHA expression group, increased CTLs did not improve prognosis and even worsened it, supporting LDHA’s role in immune evasion. Moreover, high LDHA expression was linked to non-response to immunotherapy in melanoma cohorts and pointed to higher non-response rates to immunotherapy in multiple cancers, including NSCLC. This indicates that LDHA could be a crucial predictor of immunotherapy outcomes. We also found that high LDHA expression correlated with reduced sensitivity to the EGFR inhibitor afatinib, further suggesting a potential role for LDHA in tumor resistance. The combination of high LDHA and low IL7R expression predicted the poorest prognosis, and high LDHA expression was associated with significantly reduced GNLY secretion, further supporting LDHA’s role in suppressing immune responses and promoting tumor progression.
Recent studies have underscored the functional impact of LDHA-mediated metabolites, particularly lactate, on immune cell function. In CD8 + T cells, lactate accumulation within the tumor microenvironment not only suppresses TCR signaling and effector functions but also contributes to the polarization of CD8 + T cells towards an exhausted phenotype. This metabolic shift diminishes the production of cytokines, including IFN-γ and TNF-α, thereby weakening the cytotoxic capacity of CD8 + T cells. Furthermore, lactate has been shown to modulate the expression of immune checkpoints, such as PD-1, exacerbating immune suppression34,35. Beyond lactate, other glycolysis-related metabolites regulated by LDHA, such as pyruvate and NADH, play crucial roles in energy homeostasis and reactive oxygen species (ROS) generation, which further influence CD8 + T cell differentiation and function. For instance, elevated ROS levels can impair CD8 + T cell proliferation and viability while also enhancing oxidative stress pathways that promote immune evasion. The dual role of these metabolites highlights the complex interplay between metabolic reprogramming and immune suppression within the tumor microenvironment. Our findings therefore position LDHA as a central regulator of metabolite-driven immune suppression in NSCLC, providing a compelling rationale for targeting LDHA and its downstream metabolic pathways to restore CD8 + T cell functionality and enhance antitumor immunity. This study systematically reveals for the first time that the LTB + CD8Tn-associated target molecule LDHA suppresses CD8 + T cell function and promotes immune evasion in NSCLC through lactate production. These findings provide new insights and a foundation for developing LDHA-targeted therapies in the future.
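The stratified comparison discussed above (the prognostic value of CTL levels within the low-LDHA and high-LDHA groups) rests on assigning each patient to one of four groups. The sketch below shows one way such a grouping could be built, assuming a median split on LDHA followed by a median split on the CTL score within each LDHA group. The cutoff rule, the function name `stratify`, the field names, and the toy values are all assumptions for illustration; the source does not specify how the groups were defined.

```python
from statistics import median


def stratify(patients, outer="LDHA", inner="CTL"):
    """Group patient IDs by a median split on `outer`, then on `inner` within each outer group.

    Returns a dict keyed by (outer_level, inner_level), each level being "high" or "low".
    Values strictly above the median are "high"; all others are "low" (an assumed convention).
    """
    outer_cut = median(p[outer] for p in patients)
    groups = {}
    for level in ("high", "low"):
        if level == "high":
            subset = [p for p in patients if p[outer] > outer_cut]
        else:
            subset = [p for p in patients if p[outer] <= outer_cut]
        if not subset:
            continue
        # The inner cutoff is computed separately inside each outer group.
        inner_cut = median(p[inner] for p in subset)
        for p in subset:
            inner_level = "high" if p[inner] > inner_cut else "low"
            groups.setdefault((level, inner_level), []).append(p["id"])
    return groups


# Hypothetical expression and CTL scores (illustration only, not study data).
toy_patients = [
    {"id": "P1", "LDHA": 1.0, "CTL": 0.1},
    {"id": "P2", "LDHA": 2.0, "CTL": 0.9},
    {"id": "P3", "LDHA": 3.0, "CTL": 0.2},
    {"id": "P4", "LDHA": 4.0, "CTL": 0.8},
    {"id": "P5", "LDHA": 5.0, "CTL": 0.3},
    {"id": "P6", "LDHA": 6.0, "CTL": 0.7},
    {"id": "P7", "LDHA": 7.0, "CTL": 0.4},
    {"id": "P8", "LDHA": 8.0, "CTL": 0.6},
]

groups = stratify(toy_patients)
for key in sorted(groups):
    print(key, groups[key])
```

Each of the four groups would then be passed to a survival comparison (for example, Kaplan-Meier curves with a log-rank test), which is outside the scope of this sketch.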

Conclusion

This study systematically uncovers the heterogeneity of CD8 + T cells in the NSCLC tumor microenvironment, particularly the unique role of the LTB + CD8Tn subset in metastasis. Through multi-model machine learning screening and multimodal data analysis, we identified LDHA as a key gene associated with NSCLC metastasis, emphasizing its potential as a therapeutic target. These findings enrich the functional landscape of CD8 + T cells and offer new biomarkers and targeted strategies for personalized immunotherapy. The study has significant clinical implications, highlighting LDHA’s critical role in tumor immune evasion and its potential application in future immunotherapy. Future research should further explore LDHA’s role in other tumors and develop innovative therapeutic strategies targeting it to improve patient outcomes.