Fig. 2

Identification and analysis of candidate genes. (A) Forest plot of the univariate Cox regression analysis. (B) LASSO regression analysis. Left graph: cross-validation curve used to select the penalty parameter. The abscissa is log(lambda) and the ordinate is the cross-validation error; the numbers above the plot give the number of feature genes (degrees of freedom) retained at each value. The dashed line on the left marks the position where the cross-validation error is minimal, which determines the optimal log(lambda) value (lambda.min). The corresponding genes and their coefficients can be read from the right graph. Right graph: the abscissa is log(lambda) and the ordinate is the gene coefficient, showing how the coefficient of each variable changes as the penalty λ varies. (C) Expression of the prognostic genes in TCGA-LUAD (top graph) and GSE50081 (bottom graph). (D) Risk curves for TCGA-LUAD (top graph) and GSE50081 (bottom graph). The blue line represents the low-risk group and the red line represents the high-risk group, separated at the median risk score. Blue circles represent surviving samples (Alive) and red dots represent non-surviving samples (Death). (E) Kaplan–Meier survival analysis of the high-risk and low-risk groups in TCGA-LUAD (left graph) and GSE50081 (right graph). The abscissa is the survival time and the ordinate is the survival rate; the row labelled Number below each plot gives the number of surviving samples at the corresponding time. (F) ROC analysis for TCGA-LUAD (left graph) and GSE50081 (right graph). The abscissa is the false positive rate, the ordinate is the true positive rate, and the area under the curve (AUC) represents the prediction accuracy.
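
As a rough illustration of how the quantities in panels (D) and (F) relate, the following minimal Python sketch computes a risk score as a coefficient-weighted sum of gene expression, splits samples at the median score, and estimates an AUC. Everything in it is an assumption made for illustration: the gene names (GENE_A, GENE_B), coefficients, expression values, and outcomes are invented toy data, not values from this study, and the AUC here treats survival status as a plain binary label, ignoring follow-up time and censoring, which may differ from the method used for the figure.

```python
from statistics import median

# Hypothetical LASSO coefficients for two placeholder genes (not from the study).
coefficients = {"GENE_A": 0.5, "GENE_B": -0.3}

# Toy expression values per sample (invented for illustration).
expression = [
    {"GENE_A": 2.0, "GENE_B": 1.0},
    {"GENE_A": 1.0, "GENE_B": 2.0},
    {"GENE_A": 3.0, "GENE_B": 0.0},
    {"GENE_A": 0.0, "GENE_B": 1.0},
]

# Toy outcomes: 1 = death observed, 0 = alive (invented for illustration).
events = [0, 1, 1, 0]


def risk_scores(samples, coefs):
    """Risk score per sample: sum over genes of coefficient * expression."""
    return [sum(coefs[g] * s[g] for g in coefs) for s in samples]


def split_by_median(scores):
    """Label each sample 'high' if its score exceeds the median, else 'low'."""
    cutoff = median(scores)
    return ["high" if s > cutoff else "low" for s in scores]


def auc(scores, labels):
    """AUC as the probability that a positive sample outscores a negative one.

    Ties count as half a win. This is the rank-based (Mann-Whitney) form of
    the area under the ROC curve for a binary label.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


scores = risk_scores(expression, coefficients)
groups = split_by_median(scores)
auc_value = auc(scores, events)

print(groups)
print(auc_value)
```

An AUC of 0.5 corresponds to a score that ranks samples no better than chance, and 1.0 to a score that ranks every death above every survivor.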