Fig. 8: Development and evaluation of the overall survival predictive model.

a Workflow for developing the overall survival predictive model based on RCD.Sig, using 101 combinations of 10 machine-learning algorithms. Details are summarized in Methods. b Left panel: heat map showing the C-index of the 101 combinations in the training dataset, testing dataset 1, and testing dataset 2. Right panel: two bar plots showing the mean C-index across the testing datasets and the mean C-index across all three datasets. The stepcox [“both”]+RSF combination, which had the highest mean C-index, was selected as the final RCD.Sig survival predictive model. c–f K–M curves comparing overall survival between high-risk and low-risk patients in the training dataset, testing dataset 1, testing dataset 2, and TCGA (the combination of the training dataset, testing dataset 1, and testing dataset 2). The risk score was calculated by the optimal survival predictive model. Patients in each cohort were divided into a high-risk group and a low-risk group based on the median risk score. g Association between the risk score computed by the optimal survival predictive model and clinical outcomes, including OS, DFI, DSS, and PFI, in TCGA, assessed by univariate Cox regression analysis and the log-rank test. A p value < 0.05 with HR > 1 was considered risky; a p value ≥ 0.05 was considered nonsignificant. “N/A” indicates that the corresponding data were missing. h Forest plot showing the univariate Cox regression results for the risk score across all TCGA datasets. TCNS tumors of central nervous system, TT thoracic tumors, TDST tumors of digestive system, BT breast tumor, TUS tumors of urinary system and male genital organs, TFRO tumors of female reproductive organs, TEO tumors of endocrine organs, TO tumors of others. i Meta-analysis of the overall prognostic performance of the risk score in these 23 cohorts.
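
The quantities named in panels b–f (C-index, selection by highest mean C-index, and the median split into risk groups) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `c_index`, `median_split`, and `pick_best` are hypothetical, the C-index shown is the plain Harrell form that skips pairs with tied survival times, and assigning patients with a score strictly above the median to the high-risk group is an assumption, since the caption does not say how ties at the median are handled.

```python
from statistics import mean, median


def c_index(times, events, risks):
    """Harrell's C-index (simplified: pairs with tied times are skipped)."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i has an observed event
            # strictly before patient j's follow-up time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0   # higher risk, earlier event
                elif risks[i] == risks[j]:
                    concordant += 0.5   # tied risk scores count as half
    return concordant / comparable


def median_split(risks):
    """Label each patient 'high' or 'low' relative to the median risk score.

    Assumption: scores strictly above the median are 'high'.
    """
    cutoff = median(risks)
    return ["high" if r > cutoff else "low" for r in risks]


def pick_best(cindex_by_model):
    """Return the model name with the highest mean C-index across datasets."""
    return max(cindex_by_model, key=lambda m: mean(cindex_by_model[m]))


# Toy data, invented purely for illustration.
times = [5, 10, 15, 20]
events = [1, 1, 0, 1]          # 1 = death observed, 0 = censored
risks = [0.9, 0.7, 0.4, 0.2]

print(c_index(times, events, risks))
print(median_split(risks))
print(pick_best({"model_A": [0.70, 0.60, 0.65], "model_B": [0.80, 0.60, 0.70]}))
```

Which datasets enter the mean in `pick_best` is left to the caller, because the caption reports both the mean over the two testing datasets and the mean over all three datasets.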