Fig. 2: Development and validation of a DNA methylation-site model for bladder cancer diagnosis.

A Correlation heatmap of DNA methylation levels versus gene expression for six genes. Color intensity represents the Spearman correlation coefficient (ρ), with darker colors indicating stronger correlations. Significant negative correlations (ρ < −0.3, p < 0.05) are highlighted.
B-1 LASSO regression coefficient paths for the selected methylation sites, plotted against log(λ). The red vertical line marks the optimal λ (λ.min).
B-2 Cross-validation error curve for the LASSO regression model. The red dashed line marks λ.min, the value that minimizes the mean squared error.
C-1 Receiver operating characteristic (ROC) curves for individual methylation sites, with the area under the curve (AUC) reported for each site as a measure of diagnostic performance.
C-2 Decision curve analysis (DCA) comparing the logistic regression model with four methylation sites against the ALL model with seven methylation sites.
C-3 ROC curve for the logistic regression model using four methylation sites (AUC = 0.952).
C-4 Box plots comparing model scores between normal and tumor samples in the TCGA-BLCA dataset.
C-5 Box plots comparing methylation levels at four sites (cg10395685, cg16536329, cg23037403, cg02165355) between normal and tumor samples across groups g1–g5.
D-1 ROC curves for individual methylation sites in the validation dataset (GSE120288), with the AUC reported for each site.
D-2 ROC curve for the four-site logistic regression model in the validation dataset (AUC = 0.972).
D-3 DCA for the logistic regression model in the validation dataset.
D-4 Box plots comparing model scores between normal and tumor samples in the GSE120288 dataset.
D-5 Box plots comparing methylation levels at the same four sites (cg10395685, cg16536329, cg23037403, cg02165355) between normal and tumor samples across groups g1–g5 in the validation dataset.
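The ROC and DCA panels rest on two standard quantities: the AUC, which is the probability that a randomly chosen tumor sample scores higher than a randomly chosen normal sample, and the net benefit at a threshold probability p_t, which is TP/n − (FP/n) · p_t/(1 − p_t). The following is a minimal sketch of both in plain Python, not the authors' analysis code. The function names and the toy scores and labels are hypothetical and are not taken from TCGA-BLCA or GSE120288.

```python
from typing import Sequence


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the fraction of (tumor, normal) pairs ranked correctly.

    Ties between a tumor and a normal score count as half a correct pair.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one sample of each class")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def net_benefit(scores: Sequence[float], labels: Sequence[int],
                threshold: float) -> float:
    """Net benefit of calling a sample 'tumor' when score >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(labels)
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(labels: Sequence[int], threshold: float) -> float:
    """Reference curve in a DCA plot: every sample is called 'tumor'."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    prevalence = sum(labels) / len(labels)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Hypothetical model scores (predicted tumor probabilities); 1 = tumor, 0 = normal.
toy_scores = [0.10, 0.20, 0.35, 0.60, 0.40, 0.70, 0.80, 0.90]
toy_labels = [0, 0, 0, 0, 1, 1, 1, 1]

if __name__ == "__main__":
    print("AUC:", auc(toy_scores, toy_labels))
    for pt in (0.2, 0.5, 0.8):
        print(pt, net_benefit(toy_scores, toy_labels, pt),
              net_benefit_treat_all(toy_labels, pt))
```

A decision curve plots net benefit over a range of thresholds. A model is useful at a given threshold when its curve lies above both the treat-all reference and the treat-none line at zero.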