Fig. 1: Differentially methylated CpGs between ALL survivors in the NPG discovery group and healthy controls, and predictive efficacy of the differentially methylated sites.

A The 452 CpGs (red) that met genome-wide significance (q-value < 0.05; Δβ > 5%) in testing for differential methylation between ALL survivors (NPG discovery) and controls. B PCA at 33 CpGs, derived by filtering the 452 differentially methylated CpGs, separates controls from ALL survivors among the NPG discovery samples (n = 91), controls (n = 70), and independent samples (PETALE ALL survivors without CRT, n = 25; additional control samples, n = 50). PC1 accounts for 53% of the variability across samples at these 33 CpGs. All samples plotted were used as training samples for the RF model. C Probability scores from the RF model demonstrate 98% sensitivity and 85% specificity in 110 additional ALL survivors (PETALE, PETALE CRT) and 394 controls (controls Weksberg and control GEO). All NPG validation samples (n = 49) were classified correctly. The classification threshold was set to 50%, as indicated by the dashed line. D Application of the classification model to St. Jude's LIFE cohort (GSE169156), with the cohort split into two groups: “SJ leukemia” (n = 510), individuals presumed to have been treated for leukemia or lymphoma, and “SJ other” (n = 1023), individuals treated for all other cancers reported in this cohort. RF probability scores in SJ leukemia (99% positive) and SJ other (77% positive) were significantly different.
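
The sensitivity and specificity reported in panel C follow from applying the 50% threshold to each sample's probability score. A minimal sketch of that calculation is given below. The function name, the toy scores, and the convention that a score at or above the threshold counts as positive are all assumptions made for illustration; the values are not data from the study.

```python
def sensitivity_specificity(scores, labels, threshold=0.5):
    """Compute sensitivity and specificity at a probability threshold.

    scores: classifier probabilities for the "ALL survivor" class.
    labels: 1 for ALL survivor, 0 for control.
    Assumption: a score >= threshold is called positive.
    """
    tp = fn = tn = fp = 0
    for score, label in zip(scores, labels):
        predicted_positive = score >= threshold
        if label == 1:
            if predicted_positive:
                tp += 1  # survivor correctly called positive
            else:
                fn += 1  # survivor missed
        else:
            if predicted_positive:
                fp += 1  # control wrongly called positive
            else:
                tn += 1  # control correctly called negative
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical toy scores (NOT study data): four survivors, four controls.
toy_scores = [0.92, 0.81, 0.47, 0.66, 0.12, 0.30, 0.58, 0.05]
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]

sens, spec = sensitivity_specificity(toy_scores, toy_labels)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

The same counting applied to a single group gives the "percent positive" figures quoted for panel D, that is, the fraction of samples in that group whose score reaches the threshold.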