Fig. 2

Gene set enrichment analysis (GSEA) between clusters and machine learning-based selection of key differential lactylation-related genes. (A) GSEA plots comparing the enrichment of hallmark pathways between the two molecular clusters identified based on lactylation-related genes. (B) Least absolute shrinkage and selection operator (LASSO) regression coefficient profiles depicting the trajectory of each gene's coefficient as a function of log(λ). (C) Ten-fold cross-validation for LASSO regression showing the relationship between mean cross-validated error and log(λ). (D) Bar plot of the key variables selected by the LASSO model. (E) Random forest (RF) classifier error plot illustrating the relationship between the number of trees and the out-of-bag (OOB) error rate, used for model optimization and stability assessment. (F) Top five genes in the RF model, ranked by variable importance score. (G) Top five genes in the extreme gradient boosting (XGBoost) model, ranked by their importance in the model. (H) Venn diagram displaying the overlap of key lactylation-related genes identified by the LASSO, RF, and XGBoost machine learning algorithms.